Abstract

We study algorithmic randomness and monotone complexity on the product of the set of infinite binary sequences. We explore the following problems: monotone complexity on product space, Lambalgen's theorem for correlated probability, classification of random sets by likelihood ratio tests, decomposition of complexity and independence, and Bayesian statistics for individual random sequences. Previously, Lambalgen's theorem for correlated probability was shown under a uniform computability assumption in [H. Takahashi, Inform. Comp. 2008]. In this paper we show the theorem without that assumption.
Keywords: Martin-Löf randomness, Kolmogorov complexity, Lambalgen's theorem, consistency, Bayesian statistics

1 Introduction 

It is known that Martin-Löf random sequences [11] satisfy many laws of probability one, for example the ergodic theorem, the martingale convergence theorem, and so on, see [SUCH]. In this paper, we study Martin-Löf random sequences with respect to a probability on the product space $\Omega \times \Omega$, where $\Omega$ is the set of infinite binary sequences. In particular, we investigate the following problems:

1. Randomness and monotone complexity on product space (Levin-Schnorr theorem for product space).

2. Lambalgen's theorem [22] for correlated probability. 

3. Likelihood ratio test and classification of random sets. 

4. Decomposition of complexity and independence of individual random 
sequences. 

5. Bayesian statistics for individual random sequences. 

All of the above problems except problem 3 are specific to product space.

In Section 3 we show Lambalgen's theorem for correlated probability. In the previous paper [19], the theorem is shown under a uniform computability assumption. In this paper, we show the theorem without that assumption. This is the main theorem of this paper (Theorem 3.3).

The other sections are as follows: In Section 2 we define monotone complexity on product space. The usual definition of one-dimensional monotone complexity strongly depends on the order structure of one-dimensional space. In order to define monotone complexity on product space, we give an algebraic definition of monotone function for product space, which is applicable, mutatis mutandis, to an abstract partially ordered set. In Section 4 we show a classification of random sets by likelihood ratio tests. In particular we show an important theorem by Martin-Löf, i.e., two computable probabilities are mutually singular iff their random sets are disjoint. As a simple application, we show consistency of MDL for individual sequences. In Section 5 we show a decomposition of monotone complexity for prefixes of random sequences under a condition. As a corollary, we show some equivalent conditions for independence of individual random sequences. In Section 6 we apply our results to Bayesian statistics. By virtue of randomness theory, we can develop a point-wise theory for Bayesian statistics. In particular, we show consistency of posterior distribution (and its equivalent conditions) for individual random sequences. In order to show this, the results of Section 4 play an important role. We also show an asymptotic theory of estimation for individual sequences, which is closely related to decomposition of complexity.
2 Randomness and complexity

First we introduce Martin-Löf randomness on $\Omega$. Let $S$ be the set of finite binary strings. Let $\Omega$ be the set of infinite binary sequences with product topology. As in [19], we write $A \subset B$ including $A = B$. Throughout the paper, the base of logarithm is 2. We use symbols such as $x, y, s$ to denote an element of $S$ and $x^\infty, y^\infty$ to denote an element of $\Omega$. For $x \in S$, let $\Delta(x) := \{x\omega : \omega \in \Omega\}$, where $x\omega$ is the concatenation of $x$ and $\omega$, and for $x^\infty \in \Omega$, $\Delta(x^\infty) := \{x^\infty\}$. Let $\lambda \in S$ be the empty word, then $\Delta(\lambda) = \Omega$.

For $A \subset S$, let $\sigma\{\Delta(x)\}_{x \in A}$ be the $\sigma$-algebra generated by $\{\Delta(x)\}_{x \in A}$ and $\mathcal{B} := \sigma\{\Delta(x)\}_{x \in S}$. Let $(\Omega, \mathcal{B}, P)$ be a probability space. We write $P(x) := P(\Delta(x))$ for $x \in S$, then we have $P(x) = P(x0) + P(x1)$ for all $x$. Let $\mathbb{N}$, $\mathbb{Q}$, and $\mathbb{R}$ be the set of natural numbers, rational numbers, and real numbers, respectively. $P$ is called computable if there exists a computable function $p : S \times \mathbb{N} \to \mathbb{Q}$ such that $\forall x \in S\ \forall k \in \mathbb{N}$, $|P(x) - p(x,k)| < 1/k$. A set $A \subset S$ is called recursively enumerable (r.e.) if there is a computable function $f : \mathbb{N} \to S$ such that $f(\mathbb{N}) = A$. For $A \subset S$, let $\tilde{A} := \bigcup_{x \in A} \Delta(x)$. A set $U \subset \mathbb{N} \times S$ is called a (Martin-Löf) test with respect to $P$ if 1) $U$ is r.e., 2) $\tilde{U}_{n+1} \subset \tilde{U}_n$ for all $n$, where $U_n = \{x : (n,x) \in U\}$, and 3) $P(\tilde{U}_n) < 2^{-n}$. In the following, if $P$ is obvious from the context, we say that $U$ is a test. A test $U$ is called universal if for any other test $V$, there is a constant $c$ such that $\forall n\ \tilde{V}_{n+c} \subset \tilde{U}_n$.

Theorem 2.1 (Martin-Löf [11]) If $P$ is a computable probability, a universal test $U$ exists.

In [11], the set $(\bigcap_{n=1}^{\infty} \tilde{U}_n)^c$ (complement of the limit of a universal test) is defined to be the set of random sequences with respect to $P$, where $U$ is a universal test. We write $\mathcal{R}^P := (\bigcap_{n=1}^{\infty} \tilde{U}_n)^c$. Note that for two universal tests $U$ and $V$, $\bigcap_{n=1}^{\infty} \tilde{U}_n = \bigcap_{n=1}^{\infty} \tilde{V}_n$ and hence $\mathcal{R}^P$ does not depend on the choice of a universal test. An equivalent definition of test is that $U$ is r.e. and $\sum_n P(\tilde{U}_n) < \infty$. Then the set covered by $U_n$ infinitely many times is a limit of a test, i.e., $\limsup_n \tilde{U}_n \subset (\mathcal{R}^P)^c$, see [11].

For $x, y \in S$, let $\Delta(x,y) := \Delta(x) \times \Delta(y)$. Let $\mathcal{B}_{S^2} := \sigma\{\Delta(x,y) \mid x, y \in S\}$. Then computability of $P$ on $(\Omega^2, \mathcal{B}_{S^2})$, its Martin-Löf tests, and the set of random sequences are defined similarly.






2.1 Complexity 



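For $x', x \in S \cup \Omega$, we write $x' \sqsubseteq x$ if $x'$ is a prefix of $x$, i.e., $\Delta(x') \supset \Delta(x)$, and for $(x',y'), (x,y) \in (S \cup \Omega)^2$, we write $(x',y') \sqsubseteq (x,y)$ if $x' \sqsubseteq x$ and $y' \sqsubseteq y$, i.e., $\Delta(x',y') \supset \Delta(x,y)$. Then $S \cup \Omega$ and $(S \cup \Omega)^2$ are partially ordered sets. For $A \subset S^2$, let $\bigvee A$ be the least upper bound of $A$. Then $\bigvee A$ exists in $(S \cup \Omega)^2$ iff $\bigcap_{(x,y) \in A} \Delta(x,y) \neq \emptyset$. In the following bold-faced symbols $\mathbf{x}, \mathbf{y}, \mathbf{p}$ denote an element of $(S \cup \Omega)^2$, $\mathbf{x}^\infty$ denotes an element of $\Omega^2$, and $\boldsymbol{\lambda} = (\lambda, \lambda)$.

First we define monotone functions $(S \cup \Omega)^2 \to (S \cup \Omega)^2$. Let $F \subset S^2 \times S^2$ and $F_{\mathbf{p}} := \{\mathbf{x} \mid (\mathbf{p}, \mathbf{x}) \in F\}$. Assume that

$$\forall \mathbf{p} \in S^2,\ \boldsymbol{\lambda} \in F_{\mathbf{p}} \text{ and } \bigvee_{\mathbf{p}' \sqsubseteq \mathbf{p}} F_{\mathbf{p}'} \text{ exists.} \tag{1}$$

Set

$$f(\mathbf{p}) := \bigvee_{\mathbf{p}' \sqsubseteq \mathbf{p},\ \mathbf{p}' \in S^2} F_{\mathbf{p}'} \quad \text{for } \mathbf{p} \in (S \cup \Omega)^2. \tag{2}$$

We see that $f : (S \cup \Omega)^2 \to (S \cup \Omega)^2$ and $f$ is monotone, i.e.,

$$\mathbf{p}' \sqsubseteq \mathbf{p} \Rightarrow f(\mathbf{p}') \sqsubseteq f(\mathbf{p}).$$

Conversely, let $f : (S \cup \Omega)^2 \to (S \cup \Omega)^2$ be a monotone function, and set

$$F := \{(\mathbf{p}, \mathbf{x}) \in S^2 \times S^2 \mid \mathbf{x} \sqsubseteq f(\mathbf{p})\}.$$

Then $\bigvee F_{\mathbf{p}} = f(\mathbf{p})$, $F$ satisfies (1), and the function defined by $F$ coincides with $f$. If $F$ is a r.e. set, the function $f$ defined by (2) is called a computable monotone function.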
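The construction of a monotone function from a relation can be tried out on a small finite example. The following is only a sketch under simplifying assumptions: the relation `F` is a hypothetical finite list chosen for illustration, all arguments are pairs of finite strings, and the helper names (`join_strings`, `join_pairs`, `make_monotone`) are not from the paper.

```python
def join_strings(strings):
    """Least upper bound of finite binary strings under the prefix order.

    Returns None when two of the strings are incomparable (no upper bound).
    """
    longest = max(strings, key=len)
    if all(longest.startswith(s) for s in strings):
        return longest
    return None


def join_pairs(pairs):
    """Componentwise least upper bound of pairs of strings, or None."""
    x = join_strings([p[0] for p in pairs])
    y = join_strings([p[1] for p in pairs])
    if x is None or y is None:
        return None
    return (x, y)


def make_monotone(F):
    """Build f(p) as the join of all outputs attached to prefix pairs of p."""
    def f(p):
        outputs = [("", "")]  # the empty pair always belongs to F_p
        for (q1, q2), x in F:
            if p[0].startswith(q1) and p[1].startswith(q2):
                outputs.append(x)
        return join_pairs(outputs)
    return f


# Hypothetical finite relation, chosen only for illustration.
F = [
    (("0", ""), ("1", "")),
    (("01", ""), ("10", "0")),
    (("", "1"), ("", "0")),
]
f = make_monotone(F)
print(f(("0", "")), f(("01", "1")))
```

Extending the input pair can only extend the output pair, which is the monotonicity property stated above.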

For $s \in S$, let $|s|$ be the length of $s$. In particular $|\lambda| = 0$ and $|x^\infty| = \infty$. For $\mathbf{p} = (p_1, p_2) \in (S \cup \Omega)^2$, let $|\mathbf{p}| := |p_1| + |p_2|$. The monotone complexity with respect to a computable monotone function $f : (S \cup \Omega)^2 \to (S \cup \Omega)^2$ is defined as follows:

$$Km^2_f(x,y) := \min\{|p_1| + |p_2| \mid (x,y) \sqsubseteq f(p_1, p_2)\},$$

$$Km_f(x,y) := \min\{|p| \mid (x,y) \sqsubseteq f(p, \lambda)\},$$

for $x, y, p, p_1, p_2 \in S \cup \Omega$. If there is no $(p_1, p_2)$ such that $(x,y) \sqsubseteq f(p_1, p_2)$, then $Km^2_f(x,y) := \infty$. Similarly, $Km_f(x,y) := \infty$ if there is no $p$ such that $(x,y) \sqsubseteq f(p, \lambda)$.

A computable monotone function $u : (S \cup \Omega)^2 \to (S \cup \Omega)^2$ is called optimal if for any computable monotone function $f : (S \cup \Omega)^2 \to (S \cup \Omega)^2$, there is a constant $c$ such that $Km^2_u(\mathbf{x}) \le Km^2_f(\mathbf{x}) + c$ for all $\mathbf{x} \in (S \cup \Omega)^2$. We can construct an optimal function in the following manner. First, observe that there is a r.e. set $F \subset \mathbb{N} \times S^2 \times S^2$ such that 1) $F_i = \{(\mathbf{p}, \mathbf{x}) \mid (i, \mathbf{p}, \mathbf{x}) \in F\}$ satisfies (1) for all $i \in \mathbb{N}$, and 2) for each r.e. set $F'$ that satisfies (1), there is $i$ such that $F' = F_i$. Note that the first condition in (1) is necessary to enumerate $\{F_i\}$. Next, set $F^u := \{(i\mathbf{p}, \mathbf{x}) \mid (i, \mathbf{p}, \mathbf{x}) \in F\}$, where $i\mathbf{p} = (0^i 1 p_1, p_2)$ for $\mathbf{p} = (p_1, p_2)$. Let $u$ be a computable monotone function defined by $F^u$ via (2), then we see that $u$ is optimal. In the following discussion, we fix $u$ and let

$$Km^2(x,y) := Km^2_u(x,y), \quad Km(x,y) := Km_u(x,y),$$

$$Km(x|y) := \min\{|p| \mid (x, \lambda) \sqsubseteq u(p, y)\},$$

$$Km(x) := Km(x|\lambda) \quad \text{for } x, y \in S \cup \Omega.$$

By definition, we have $\forall x, y$, $Km^2(x,y) \le Km(x,y)$. Note that $Km$ is equivalent to a monotone complexity that is defined from an optimal monotone function $S \cup \Omega \to (S \cup \Omega)^2$. Also note that $Km(x)$ defined above is different from $Km^2(x) := Km^2(x, \lambda)$. Later we show that the difference between $Km^2$ and $Km$ is bounded for prefixes of random sequences under a condition, see Corollary 2.1.

In the following, a subset $A$ of $S \cup \Omega$ or $(S \cup \Omega)^2$ is called non-overlapping if $\Delta(x) \cap \Delta(y) = \emptyset$ for $x, y \in A$, $x \neq y$. Note that $\Delta(x) \cap \Delta(y) = \emptyset$ implies that $x$ and $y$ are incomparable. The converse is true if $x, y \in S \cup \Omega$. However, if $\mathbf{x}, \mathbf{y} \in (S \cup \Omega)^2$ then there is a counter-example, e.g., $(\lambda, 0)$ and $(0, \lambda)$ are incomparable but $\Delta(\lambda, 0) \cap \Delta(0, \lambda) = \Delta(0, 0)$. In the one-dimensional case, the notion of non-overlapping is equivalent to that of prefix-free. Throughout the paper we use the term "non-overlapping".
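The counter-example can be checked mechanically for pairs of finite strings. The sketch below assumes that a cylinder of a pair is the product of the two one-dimensional cylinders, so two such rectangles intersect exactly when both coordinates are comparable; the function names are illustrative and not taken from the paper.

```python
def comparable_strings(a, b):
    """True when one finite string is a prefix of the other."""
    return a.startswith(b) or b.startswith(a)


def leq(p, q):
    """Product prefix order: p is below q when both coordinates are prefixes."""
    return q[0].startswith(p[0]) and q[1].startswith(p[1])


def comparable(p, q):
    return leq(p, q) or leq(q, p)


def overlapping(p, q):
    """The rectangles of p and q intersect iff both coordinates are comparable."""
    return comparable_strings(p[0], q[0]) and comparable_strings(p[1], q[1])


def intersection(p, q):
    """Pair whose rectangle is the intersection, or None if it is empty."""
    if not overlapping(p, q):
        return None
    return (max(p[0], q[0], key=len), max(p[1], q[1], key=len))


p, q = ("", "0"), ("0", "")  # the empty string plays the role of the empty word
print(comparable(p, q), overlapping(p, q), intersection(p, q))
```

In one dimension `comparable_strings` and overlap coincide, which is why the distinction only appears on the product space.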

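Proposition 2.1 a) Monotonicity: $x \sqsubseteq z \Rightarrow Km(x|y) \le Km(z|y)$, and $y \sqsubseteq z \Rightarrow Km(x|y) \ge Km(x|z)$.

b) Kraft inequality: $\sum_{\mathbf{x} \in A} 2^{-Km(\mathbf{x})} \le \sum_{\mathbf{x} \in A} 2^{-Km^2(\mathbf{x})} \le 1$ for a non-overlapping set $A \subset (S \cup \Omega)^2$.

c) Conditional sub-additivity: $\exists c\ \forall x, y \in S \cup \Omega$, $Km^2(x,y) \le Km(x|y) + Km(y) + c$.

Proof) a) Obvious. b) Let $u$ be an optimal monotone function and $\mathbf{p}_{\mathbf{x}} \in \{\mathbf{p} \mid \mathbf{x} \sqsubseteq u(\mathbf{p})\}$. Suppose that $\Delta(\mathbf{x}) \cap \Delta(\mathbf{x}') = \emptyset$ and $\exists \mathbf{z}$, $\mathbf{z} = \mathbf{p}_{\mathbf{x}} \vee \mathbf{p}_{\mathbf{x}'}$. Then $\mathbf{x} \sqsubseteq u(\mathbf{z})$ and $\mathbf{x}' \sqsubseteq u(\mathbf{z})$, which contradicts $\Delta(\mathbf{x}) \cap \Delta(\mathbf{x}') = \emptyset$. Thus $\{\mathbf{p}_{\mathbf{x}} \mid \mathbf{x} \in A\}$ is non-overlapping for a non-overlapping set $A$. By setting $\mathbf{p}_{\mathbf{x}}$ to be an optimal code, i.e., $|\mathbf{p}_{\mathbf{x}}| = Km^2(\mathbf{x})$, we have $\sum_{\mathbf{x} \in A} 2^{-Km^2(\mathbf{x})} \le 1$. Since $Km^2 \le Km$, we have the statement. c) Let $u$ be an optimal monotone function. Suppose that $x \sqsubseteq u(p, y)$, $Km(x|y) = |p|$ and $y \sqsubseteq u(p')$, $Km(y) = |p'|$. Let $f : (S \cup \Omega)^2 \to (S \cup \Omega)^2$ be such that $f(p_1, p_2) := (u(p_1, u(p_2)), u(p_2))$ for all $p_1, p_2$. Then $f$ is monotone and $Km^2(x,y) \le |p| + |p'| = Km(x|y) + Km(y)$. ■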
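The measure-theoretic fact behind the Kraft inequality is that disjoint rectangles in the unit square have total area at most one, where the rectangle of a pair of finite strings has area equal to 2 raised to minus the sum of the two lengths. The following sketch checks this on small hand-picked sets; it uses string lengths directly rather than the complexities of the proposition, and the example sets are assumptions made for illustration.

```python
from fractions import Fraction
from itertools import combinations


def overlapping(p, q):
    """Rectangles of two pairs intersect iff both coordinates are comparable."""
    def comp(a, b):
        return a.startswith(b) or b.startswith(a)
    return comp(p[0], q[0]) and comp(p[1], q[1])


def is_non_overlapping(A):
    """True when the rectangles of all distinct pairs in A are disjoint."""
    return all(not overlapping(p, q) for p, q in combinations(A, 2))


def kraft_sum(A):
    """Sum of 2^-(|x|+|y|) over the pairs (x, y) in A, as an exact fraction."""
    return sum((Fraction(1, 2 ** (len(x) + len(y))) for x, y in A), Fraction(0))


A = [("0", ""), ("1", "0"), ("1", "1")]  # a partition of the unit square
B = [("0", "0"), ("1", "")]              # disjoint, but not covering
print(is_non_overlapping(A), kraft_sum(A), kraft_sum(B))
```

In the proof above the same bound is applied to the set of optimal codes, which is shown to be non-overlapping.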
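Next we show the Levin-Schnorr theorem for product space. Let $\mathcal{A} \subset S^2$ be a r.e. set and

$$\mathcal{A}(\mathbf{x}^\infty) := \{\mathbf{x} \in \mathcal{A} \mid \mathbf{x} \sqsubseteq \mathbf{x}^\infty\} \quad \text{for } \mathbf{x}^\infty \in \Omega^2.$$

Before proving the theorem, we need conditions on $\mathcal{A}$:

$$\mathbf{x}, \mathbf{y} \in \mathcal{A} \Rightarrow \mathbf{x} \text{ and } \mathbf{y} \text{ are comparable or } \Delta(\mathbf{x}) \cap \Delta(\mathbf{y}) = \emptyset. \tag{3}$$

If (3) holds then for any $A' \subset \mathcal{A}$ there is a non-overlapping $A'' \subset A'$ such that $\tilde{A}'' = \tilde{A}'$. Note that it is possible that $A''$ is not r.e. even if $A'$ is a r.e. set.

$$\mathbf{x}, \mathbf{y} \in \mathcal{A} \Rightarrow \exists \text{ non-overlapping } \alpha \subset \mathcal{A},\ \Delta(\mathbf{x}) \cap (\Delta(\mathbf{y}))^c = \tilde{\alpha}. \tag{4}$$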



Lemma 2.1 If $\mathcal{A}$ is r.e. and satisfies (4) then for any r.e. $A' \subset \mathcal{A}$ there is a non-overlapping r.e. $A'' \subset \mathcal{A}$ such that $\tilde{A}' = \tilde{A}''$.

Proof) Since $A'$ is r.e., there is a computable $\alpha' : \mathbb{N} \to A'$ such that $\alpha'(\mathbb{N}) = A'$. Let $A''(0) = \emptyset$. Suppose that $A''(n-1)$ is a finite non-overlapping subset of $\mathcal{A}$ and $\tilde{A}''(n-1) = \bigcup_{1 \le i \le n-1} \Delta(\alpha'(i))$. Since $A''(n-1)$ is finite, from (4), there is a non-overlapping $\alpha(n)$ such that

$$\tilde{\alpha}(n) = \Delta(\alpha'(n)) \cap (\tilde{A}''(n-1))^c. \tag{5}$$

Since $\Delta(\alpha'(n)) \cap (\tilde{A}''(n-1))^c$ is compact and $\alpha(n)$ is non-overlapping, from the Heine-Borel theorem, we see that $\alpha(n)$ is finite. Let $\beta(n) := \{\mathbf{z} \in \mathcal{A} \mid \Delta(\mathbf{z}) \subset \Delta(\alpha'(n)) \cap (\tilde{A}''(n-1))^c\}$. Since $\mathcal{A}$ is r.e. and $A''(n-1)$ is finite, $\beta(n)$ is r.e. from $\alpha'(n)$ and $A''(n-1)$. In particular, since $\alpha(n) \subset \beta(n)$, we can compute a finite non-overlapping $\alpha(n)$ that satisfies (5) from $\alpha'(n)$ and $A''(n-1)$. Let $A''(n) := A''(n-1) \cup \alpha(n)$, then $A''(n)$ is a finite non-overlapping set. Let $A'' := \bigcup_n A''(n)$. By induction, $A'' \subset \mathcal{A}$ is a non-overlapping r.e. set such that $\tilde{A}' = \tilde{A}''$. ■
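Theorem 2.2 (Levin-Schnorr theorem [H], [15], [16] on product space)

Let $P$ be a computable probability on $(\Omega^2, \mathcal{B}_{S^2})$. Let $\mathcal{A}$ be a r.e. set that satisfies (3) and (4). Then

$$\mathbf{x}^\infty \in \mathcal{R}^P \Leftarrow \sup_{\mathbf{x} \in \mathcal{A}(\mathbf{x}^\infty)} -\log P(\mathbf{x}) - Km(\mathbf{x}) < \infty,\ \mathbf{x}^\infty = \bigvee \mathcal{A}(\mathbf{x}^\infty),$$

$$\mathbf{x}^\infty \in \mathcal{R}^P \Rightarrow \sup_{\mathbf{x} \in \mathcal{A}(\mathbf{x}^\infty)} -\log P(\mathbf{x}) - Km(\mathbf{x}) < \infty.$$

The above statements hold for $Km^2$.

Proof) Suppose that $\mathbf{x}^\infty \notin \mathcal{R}^P$ and $\mathbf{x}^\infty = \bigvee \mathcal{A}(\mathbf{x}^\infty)$. Then there is a test $U$ such that for all $n$, $\mathbf{x}^\infty \in \tilde{U}_n$ and $P(\tilde{U}_n) < 2^{-n}$. Let $U'_n := \{\mathbf{y} \in \mathcal{A} \mid \exists \mathbf{x} \in U_n,\ \mathbf{x} \sqsubseteq \mathbf{y}\}$. Since $U_n$ and $\mathcal{A}$ are r.e. sets, $U'_n \subset \mathcal{A}$ is a r.e. set. From Lemma 2.1, there is a non-overlapping r.e. set $U''_n \subset \mathcal{A}$ such that $\tilde{U}'_n = \tilde{U}''_n$. Since $\mathbf{x}^\infty = \bigvee \mathcal{A}(\mathbf{x}^\infty)$, we have $\mathbf{x}^\infty \in \tilde{U}''_n$ and $\forall n$, $U''_n \cap \mathcal{A}(\mathbf{x}^\infty) \neq \emptyset$. Let $P'$ be a measure such that $P'(\mathbf{x}) = P(\mathbf{x}) 2^n$ for $\mathbf{x} \in U''_n$ and $0$ otherwise. Since $P(\tilde{U}''_n) < 2^{-n}$, we have $\sum_{\mathbf{x} \in U''_n} P'(\mathbf{x}) < 1$. By applying Shannon-Fano-Elias coding to $P'$ on $U''_n$, we have $\exists c_1, c_2 > 0\ \forall n\ \exists \mathbf{x} \in \mathcal{A}(\mathbf{x}^\infty)$, $Km(\mathbf{x}) \le -\log P(\mathbf{x}) - n + K(n) + c_1 \le -\log P(\mathbf{x}) - n + 2\log n + c_2$, where $K$ is the prefix complexity.

Conversely, let $U_n := \{\mathbf{x} \in \mathcal{A} \mid Km(\mathbf{x}) < -\log P(\mathbf{x}) - n\}$. From (3), we see that there is a non-overlapping set $U'_n \subset U_n$ such that $\tilde{U}'_n = \tilde{U}_n$. Hence $P(\tilde{U}_n) = P(\tilde{U}'_n) < \sum_{\mathbf{x} \in U'_n} 2^{-Km(\mathbf{x}) - n} \le 2^{-n}$, where the last inequality follows from Proposition 2.1 b. Since $U_n$ is a r.e. set, $\{U_n\}$ is a test and $\bigcap_n \tilde{U}_n \subset (\mathcal{R}^P)^c$. The proof for $Km^2$ is the same as above. ■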

Example 1 Let $g : \mathbb{N} \to \mathbb{N}$ be a total-computable monotonically increasing function, where $n < m \Rightarrow g(n) \le g(m)$. Let

$$\mathcal{A}_g := \{(x,y) \in S^2 \mid |y| = g(|x|)\}. \tag{6}$$

Then $\mathcal{A}_g$ is decidable and satisfies (3) and (4). If $g$ is unbounded then

$$\forall \mathbf{x}^\infty,\ \bigvee \mathcal{A}_g(\mathbf{x}^\infty) = \mathbf{x}^\infty.$$

Next we study a coding problem for multi-dimensional monotone complexity. The following lemma shows that if $\mathcal{A}$ is decidable and satisfies (3), we have the same one-dimensional coding as in [20].
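The set of Example 1, consisting of the pairs (x, y) whose second length equals g applied to the first length, can be explored numerically. The sketch below enumerates its members up to a small length bound for the assumed choice g(n) = n and verifies that any two members are either comparable or have disjoint rectangles, which is condition (3). The bound `N` and the helper names are illustrative choices, and a finite check of this kind is of course not a proof.

```python
from itertools import combinations, product


def binary_strings(max_len):
    """All binary strings of length 0..max_len."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            yield "".join(bits)


def leq(p, q):
    return q[0].startswith(p[0]) and q[1].startswith(p[1])


def comparable(p, q):
    return leq(p, q) or leq(q, p)


def overlapping(p, q):
    def comp(a, b):
        return a.startswith(b) or b.startswith(a)
    return comp(p[0], q[0]) and comp(p[1], q[1])


def satisfies_condition_3(A):
    """Any two members are comparable or have disjoint rectangles."""
    return all(comparable(p, q) or not overlapping(p, q)
               for p, q in combinations(A, 2))


def g(n):
    return n  # assumed rate function for this illustration


N = 3
A_g = [(x, y) for x in binary_strings(N) for y in binary_strings(g(N))
       if len(y) == g(len(x))]
print(len(A_g), satisfies_condition_3(A_g))
```

By contrast, the two-element set containing the pairs (empty, 0) and (0, empty) fails the same check.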
Lemma 2.2 Let $P$ be a computable probability on $(\Omega^2, \mathcal{B}_{S^2})$ and let $\mathcal{A} \subset S^2$ be a decidable set that satisfies (3); then there is a computable monotone function $g : S \cup \Omega \to (S \cup \Omega)^2$ such that

$$\exists c\ \forall \mathbf{x} \in \mathcal{A}(\mathbf{x}^\infty),\ Km_g(\mathbf{x}) \le -\log P(\mathbf{x}) + c.$$

Proof) If $\mathcal{A}$ is decidable and satisfies (3) then, by rearranging an enumeration of $\mathcal{A}$, we see that there is a computable $f : \mathbb{N} \to S^2$ such that $f(\mathbb{N}) = \mathcal{A}$ and $\forall i, j$, $i < j$, $\Delta(f(i)) \cap \Delta(f(j)) = \emptyset$ or $f(i) \sqsubseteq f(j)$. Then we can construct a family of half-open intervals $V_{f(i)} := [a(i), b(i)) \subset [0,1]$, $\forall i \in \mathbb{N}$, that satisfies the following conditions: 0) $V_{\boldsymbol{\lambda}} = [0,1]$, 1) $|V_{f(i)}| = P(f(i))$ for all $i$, where $|V|$ is the length of the interval $V$, 2) if $\Delta(f(i)) \cap \Delta(f(j)) = \emptyset$ then $V_{f(i)} \cap V_{f(j)} = \emptyset$, 3) if $f(i) \sqsubseteq f(j)$ then $V_{f(i)} \supset V_{f(j)}$, and 4) $a$ and $b$ are computable, i.e., there are rational valued computable functions $A : \mathbb{N} \times \mathbb{N} \to \mathbb{Q}$ and $B : \mathbb{N} \times \mathbb{N} \to \mathbb{Q}$ such that $\forall i, k$, $|a(i) - A(i,k)| < 1/k$, $|b(i) - B(i,k)| < 1/k$. For $s = s_1 s_2 \cdots s_n \in S$, $\forall i$, $s_i \in \{0,1\}$, let $I_s := [\sum_{1 \le i \le n} s_i 2^{-i},\ \sum_{1 \le i \le n} s_i 2^{-i} + 2^{-n})$. Then set $F := \{(s, f(i)) \in S \times S^2 \mid I_s \subset V_{f(i)},\ i \in \mathbb{N}\}$. We see that $F$ is a r.e. set that satisfies (1). Let $g$ be a computable monotone function defined by $F$, then we have $g : S \cup \Omega \to (S \cup \Omega)^2$ and $\exists c\ \forall \mathbf{x} \in \mathcal{A}(\mathbf{x}^\infty)$, $Km_g(\mathbf{x}) \le -\log P(\mathbf{x}) + c$. ■
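The coding step in this proof rests on a standard fact: every half-open interval of length L inside the unit interval contains a dyadic interval attached to some binary string of length at most about log(1/L) + 2. The sketch below finds the shortest such string for an interval with exact rational endpoints. This is a simplification, since in the proof the endpoints are only computable to arbitrary precision; the function names are hypothetical.

```python
import math
from fractions import Fraction


def dyadic_interval(s):
    """Endpoints (lo, hi) of the half-open dyadic interval coded by string s."""
    lo = sum((Fraction(int(bit), 2 ** (i + 1)) for i, bit in enumerate(s)),
             Fraction(0))
    return lo, lo + Fraction(1, 2 ** len(s))


def shortest_code(a, b):
    """Shortest binary string whose dyadic interval lies inside [a, b).

    Assumes 0 <= a < b <= 1 with exact rational endpoints.
    """
    a, b = Fraction(a), Fraction(b)
    n = 0
    while True:
        width = Fraction(1, 2 ** n)
        k = math.ceil(a / width)        # leftmost grid point not below a
        if (k + 1) * width <= b:        # the whole dyadic cell fits below b
            return format(k, "b").zfill(n) if n > 0 else ""
        n += 1


s = shortest_code(Fraction(1, 3), Fraction(2, 3))
print(s, dyadic_interval(s))
```

The length of the returned string exceeds the negative logarithm of the interval length by at most 2, which corresponds to the additive constant in the lemma.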
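From Theorem 2.2, Lemma 2.2, and Proposition 2.1 c, we have

Corollary 2.1 Let $P$ be a computable probability on $(\Omega^2, \mathcal{B}_{S^2})$. If $\mathcal{A} \subset S^2$ is decidable and satisfies (3) and (4), then

$$\mathbf{x}^\infty \in \mathcal{R}^P \Rightarrow \sup_{\mathbf{x} \in \mathcal{A}(\mathbf{x}^\infty)} |\log P(\mathbf{x}) + Km(\mathbf{x})| < \infty,$$

$$\mathbf{x}^\infty \in \mathcal{R}^P \Leftarrow \sup_{\mathbf{x} \in \mathcal{A}(\mathbf{x}^\infty)} |\log P(\mathbf{x}) + Km(\mathbf{x})| < \infty,\ \mathbf{x}^\infty = \bigvee \mathcal{A}(\mathbf{x}^\infty).$$

The above statements are true for $Km^2$, and

$$\mathbf{x}^\infty \in \mathcal{R}^P \Rightarrow \sup_{\mathbf{x} \in \mathcal{A}(\mathbf{x}^\infty)} |Km(\mathbf{x}) - Km^2(\mathbf{x})| < \infty$$

$$\Rightarrow \sup Km(x,y) - Km(x|y) - Km(y) < \infty.$$

For 1-dimensional monotone complexity and its relation to other complexities, see [TO], [21]. In [5], a conditional complexity that is monotone with respect to the conditional argument is defined.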
Remark 1 It is not difficult to develop monotone functions and complexity in an abstract way. Indeed, let $A$ and $\bar{A}$ be partially ordered sets such that $A$ is r.e. and $\bar{A} := \{\bigvee B \mid B \subset A\}$. Let $F \subset A \times A$ be a r.e. set that satisfies (1) with respect to the partial order of $A$. Then we can define an (optimal) monotone function $f : \bar{A} \to \bar{A}$ in a similar way to Section 2.1. For example, for $\mathbf{x}, \mathbf{y} \in (S \cup \Omega)^\infty$, let $\mathbf{x} \sqsubseteq \mathbf{y}$ if $\forall i$, $x^i \sqsubseteq y^i$ for $\mathbf{x} = (x^1, x^2, \ldots)$, $\mathbf{y} = (y^1, y^2, \ldots)$, $x^i, y^i \in S \cup \Omega$. Then $(S \cup \Omega)^\infty$ is a partially ordered set. Let $A := \{(\mathbf{x}, \lambda^\infty) \mid \mathbf{x} \in \bigcup_k S^k\}$, where $\lambda^\infty = (\lambda, \lambda, \ldots) \in S^\infty$. Then $A$ is a sub-partially ordered set of $(S \cup \Omega)^\infty$ and $\bar{A} = (S \cup \Omega)^\infty$. We can define a computable monotone function $f : \bar{A} \to \bar{A}$. For $\mathbf{x} = (x_1, \ldots, x_n, \ldots) \in \bar{A}$, let $|\mathbf{x}| := \sum_n |x_n|$. Then $Km_f$ is defined. For example, let us consider discrete time (computable) stochastic processes $X_i \in \Omega$, $i = 1, 2, \ldots$. Then their randomness and the complexity of sample paths are modeled with a computable probability on $(\Omega^\infty, \mathcal{B}_A)$ and $Km_f$, where $\mathcal{B}_A := \sigma\{\Delta(\mathbf{x}) \mid \mathbf{x} \in A\}$, $\Delta(\mathbf{x}) := \{\mathbf{x}^\infty \mid \mathbf{x} \sqsubseteq \mathbf{x}^\infty \in \Omega^\infty\}$, and computability of probabilities on $(\Omega^\infty, \mathcal{B}_A)$ is defined in a similar manner to the finite dimensional case.

Remark 2 Let $\varphi^{l,t} : (S \cup \Omega)^l \to (S \cup \Omega)^t$ be an optimal monotone function for $1 \le l, t \le \infty$. Then $Km^{l,t}(x_1, \ldots, x_t) \le Km^{l',t}(x_1, \ldots, x_t) + O(1)$ if $l' \le l$, where $Km^{l,t}$ is defined from $\varphi^{l,t}$. If $\mathcal{A} \subset S^t$, $t < \infty$, or $\mathcal{A} \subset \{(\mathbf{x}, \lambda^\infty) \mid \mathbf{x} \in \bigcup_k S^k\}$, $t = \infty$, is a decidable set that satisfies (3) and (4) then Theorem 2.2 and Corollary 2.1 hold for $Km^{l,t}$ for $1 \le l, t \le \infty$. In order to simplify the argument, in the following discussion, we use $Km$.



3 Section and relativized randomness 

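Let $P$ be a computable probability on $X \times Y = \Omega^2$. Let $P_X$ and $P_Y$ be its marginal distributions on $X$ and $Y$, respectively, i.e., $P_X(x) = P(x, \lambda)$ and $P_Y(y) = P(\lambda, y)$ for $x, y \in S$. Let

$$P(x|y) := \frac{P(x,y)}{P_Y(y)} \quad \text{if } P_Y(y) > 0,$$

and

$$P(x|y^\infty) := \lim_{y \to y^\infty} P(x|y),$$

for $y^\infty \in \Omega$ if the right-hand side exists. For a subset $A \subset X \times Y$ and $y^\infty \in Y$, set

$$A_{y^\infty} := \{x^\infty \mid (x^\infty, y^\infty) \in A\}.$$

For example, $\mathcal{R}^P_{y^\infty} = \{x^\infty \mid (x^\infty, y^\infty) \in \mathcal{R}^P\}$. Similarly, for $B \subset S \times S$, set $B_{y^\infty} := \{x \mid (x,y) \in B,\ y \sqsubseteq y^\infty\}$.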
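Marginals and conditional probabilities on cylinders can be made concrete with a toy correlated measure. In the sketch below, which is an assumption made for illustration and not an example from the paper, the bits of the second sequence are fair coin flips and each bit of the first sequence equals the corresponding bit of the second with probability 1 - `EPS`. The conditional probability of x given a prefix y of the second sequence then stops changing once y is at least as long as x, so the limit along any infinite second sequence exists trivially.

```python
from fractions import Fraction

EPS = Fraction(1, 4)  # assumed flip probability of the toy model


def joint(x, y, eps=EPS):
    """Probability of the rectangle of the pair (x, y) under the toy measure."""
    p = Fraction(1)
    for a, b in zip(x, y):               # coordinates where both bits are fixed
        p *= Fraction(1, 2) * ((1 - eps) if a == b else eps)
    k = min(len(x), len(y))
    # each remaining unmatched bit of the longer string contributes 1/2
    p *= Fraction(1, 2) ** (len(x) + len(y) - 2 * k)
    return p


def marginal_x(x):
    return joint(x, "")


def marginal_y(y):
    return joint("", y)


def conditional(x, y):
    """Conditional probability of x given y, or None when y has probability 0."""
    py = marginal_y(y)
    return joint(x, y) / py if py > 0 else None


for y in ["0", "00", "001", "0010"]:
    print(y, conditional("01", y))
```

For this model the printed values stabilize at (1 - `EPS`) times `EPS` as soon as the conditioning prefix has length 2.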

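Theorem 3.1 ([19]) If $y^\infty \in \mathcal{R}^{P_Y}$, then $P(x|y^\infty)$ exists for all $x \in S$, and $P(\cdot|y^\infty)$ is a probability measure on $(\Omega, \mathcal{B})$.

Theorem 3.2 ([TO]) $P(\mathcal{R}^P_{y^\infty}|y^\infty) = 1$ if $y^\infty \in \mathcal{R}^{P_Y}$. $\mathcal{R}^P_{y^\infty} = \emptyset$ if $y^\infty \notin \mathcal{R}^{P_Y}$.

Corollary 3.1 ([19]) $\mathcal{R}^{P_X} = \bigcup_{y^\infty \in \mathcal{R}^{P_Y}} \mathcal{R}^P_{y^\infty}$.

If $P(\cdot|y^\infty)$ is computable relative to $y^\infty$, then let $\mathcal{R}^{P(\cdot|y^\infty), y^\infty}$ be the set of random sequences with respect to $P(\cdot|y^\infty)$ relative to $y^\infty$. In [19], $\{P(\cdot|y^\infty)\}_{y^\infty}$ is called uniformly computable if there is a partial computable $A$ such that $\forall y^\infty \in \mathcal{R}^{P_Y}, x \in S, k \in \mathbb{N}\ \exists y \sqsubseteq y^\infty$, $|P(x|y^\infty) - A(x,y,k)| < 1/k$, i.e., $P(\cdot|y^\infty)$ is uniformly computable relative to all $y^\infty \in \mathcal{R}^{P_Y}$. In [19], it is shown that $\mathcal{R}^{P(\cdot|y^\infty), y^\infty} \subset \mathcal{R}^P_{y^\infty}$, and under uniform computability, $\mathcal{R}^{P(\cdot|y^\infty), y^\infty} = \mathcal{R}^P_{y^\infty}$ for $y^\infty \in \mathcal{R}^{P_Y}$. In the following we show the equivalence without assuming uniform computability; we only assume that $P(\cdot|y^\infty)$ is computable relative to a given $y^\infty \in \mathcal{R}^{P_Y}$. In order to show $\mathcal{R}^{P(\cdot|y^\infty), y^\infty} \supset \mathcal{R}^P_{y^\infty}$, first we extend a test $U^{y^\infty}$ w.r.t. $P(\cdot|y^\infty)$ to a test w.r.t. a finite measure $P'$ on $\Omega^2$ such that the section of the extended test at $y^\infty$ coincides with $U^{y^\infty}$ and the total measure of the extended test w.r.t. $P'$ is sufficiently small. Finally, by using the Markov inequality, we construct a test w.r.t. $P$.

Theorem 3.3 Assume that $y^\infty \in \mathcal{R}^{P_Y}$ and $P(\cdot|y^\infty)$ is computable relative to $y^\infty$, then $\mathcal{R}^{P(\cdot|y^\infty), y^\infty} = \mathcal{R}^P_{y^\infty}$.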

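Proof) Fix $y^\infty \in \mathcal{R}^{P_Y}$. Since $P(\cdot|y^\infty)$ is computable relative to $y^\infty$, there is a partial computable function $A : S \times S \times \mathbb{N} \to \{q \in \mathbb{Q} \mid q > 0\}$ such that (a1) $\forall x, k\ \exists y \sqsubseteq y^\infty$, $|P(x|y^\infty) - A(x,y,k)| < \frac{1}{k}$ and (a2) if $A(x,y,k)$ is defined then $A(x,y,k) = A(x,z,k)$ for all $y \sqsubseteq z$. Similarly, let $U^{y^\infty} \subset \mathbb{N} \times S$ be a Martin-Löf test with respect to $P(\cdot|y^\infty)$ relative to $y^\infty$, i.e., $U^{y^\infty}$ is a r.e. set relative to $y^\infty$ and $P(\tilde{U}^{y^\infty}_n | y^\infty) < 2^{-n}$ for all $n$, where $U^{y^\infty}_n := \{x \mid (n,x) \in U^{y^\infty}\}$. Then there is a partial computable function $B : \mathbb{N} \times \mathbb{N} \times S \to S$ such that (b1) $\forall n$, $U^{y^\infty}_n = \{x \mid \exists i,\ y \sqsubseteq y^\infty,\ B(i,n,y) = x\}$ and (b2) if $B(i,n,y)$ is defined then $B(i,n,y) = B(i,n,z)$ for all $y \sqsubseteq z$.

Let $U_n := \{(x,y) \mid \exists i,\ B(i,n,y) = x\}$. Then $U_{n,y^\infty} = U^{y^\infty}_n$. Let $U'_n \subset S \times S$ be a non-overlapping r.e. set such that $\tilde{U}_n = \tilde{U}'_n$. Then $\tilde{U}'_{n,y^\infty} = \tilde{U}^{y^\infty}_n$. Let

$$V_n := \Big\{(x,z,k) \ \Big|\ (x,y) \in U'_n,\ y \sqsubseteq z \in S,\ k \in \mathbb{N},\ \frac{1}{k} < \frac{1}{2} A(x,z,k) \text{ or } \Big(k > 2^{n+|x|},\ A(x,z,k) < \frac{1}{k}\Big)\Big\}, \tag{7}$$

and $V^{X \times Y}_n := \{(x,y) \mid (x,y,k) \in V_n\}$. Then we have

$$(x,y) \in V^{X \times Y}_n \Rightarrow \forall z \sqsupseteq y,\ (x,z) \in V^{X \times Y}_n, \tag{8}$$

$$\forall z^\infty \in \Omega,\ V^{X \times Y}_{n,z^\infty} \text{ is non-overlapping}, \tag{9}$$

$$\tilde{V}^{X \times Y}_{n,y^\infty} = \tilde{U}^{y^\infty}_n, \tag{10}$$

where (8) follows from (a2); (9) follows from the fact that $U'_n$ is non-overlapping; (10) follows from the following: from (a1) and (a2), (i) if $P(x|y^\infty) > 0$ then $\exists y \sqsubseteq y^\infty\ \exists k\ \forall z \sqsupseteq y$, $\frac{1}{k} < \frac{1}{2} A(x,z,k)$, and (ii) if $P(x|y^\infty) = 0$ then $\forall k\ \exists y \sqsubseteq y^\infty\ \forall z \sqsupseteq y$, $A(x,z,k) < \frac{1}{k}$.

Note that if $\frac{1}{k} < \frac{1}{2} A(x,y,k)$ and $y \sqsubseteq y^\infty$ then $|P(x|y^\infty) - A(x,y,k)| < \frac{1}{k} < \frac{1}{2} A(x,y,k)$, i.e.,

$$\frac{1}{2} A(x,y,k) < P(x|y^\infty) < \frac{3}{2} A(x,y,k). \tag{11}$$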



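From $V_n$, we can construct a r.e. set $W_n \subset S \times S \times \mathbb{N}$ that satisfies (12)-(16) (Lemma 3.1 below):

$$W_n \subset V_n. \tag{12}$$

$$W^{X \times Y}_n \text{ is non-overlapping, where } W^{X \times Y}_n := \{(x,y) \mid (x,y,k) \in W_n\}. \tag{13}$$

$$(x,y,k), (x,y,k') \in W_n \Rightarrow k = k'. \tag{14}$$

$$\forall z^\infty \in \Omega,\ \sum_{(x,y,k) \in W_n,\ y \sqsubseteq z^\infty} A(x,y,k) < 3 \cdot 2^{-n}. \tag{15}$$

$$\tilde{U}^{y^\infty}_n = \tilde{W}^{X \times Y}_{n,y^\infty}. \tag{16}$$

Let $P'(x,z) := A(x,z,k) P_Y(z)$ for $(x,y,k) \in W_n$, $y \sqsubseteq z$, and $P'(x,y) := 0$ for $(x,y)$ such that $\Delta(x,y) \cap \tilde{W}^{X \times Y}_n = \emptyset$. Then by (15), $P'(\tilde{W}^{X \times Y}_n) < 3 \cdot 2^{-n}$. Finally let

$$U^{X \times Y}_n := \Big\{(x,z) \in S \times S \ \Big|\ (x,y) \in W^{X \times Y}_n,\ y \sqsubseteq z,\ P(x,z) < \frac{3}{2} P'(x,z) \text{ or } P(x,z) < 2^{-n-|x|} P_Y(z)\Big\}.$$

Since $W^{X \times Y}_n$ is r.e. and $P$ is computable, we see that $U^{X \times Y}_n$ is a r.e. set. Since $W^{X \times Y}_n$ is non-overlapping, we have $\sum_{(x,y) \in W^{X \times Y}_n} 2^{-|x|} P_Y(y) \le 1$.

From (12), we have $(x,y,k) \in W_n \Rightarrow \frac{1}{k} < \frac{1}{2} A(x,y,k)$ or $k > 2^{n+|x|}$, $A(x,y,k) < \frac{1}{k}$. Since $P(x|y) \to P(x|y^\infty)$ as $y \to y^\infty$ for $y^\infty \in \mathcal{R}^{P_Y}$ (Theorem 3.1), we have for $(x,y) \in W^{X \times Y}_n$, $y \sqsubseteq y^\infty$, (i) if $\frac{1}{k} < \frac{1}{2} A(x,y,k)$ then from (11), $\exists z$, $y \sqsubseteq z \sqsubseteq y^\infty$, $P(x,z) < \frac{3}{2} P'(x,z)$, and (ii) if $k > 2^{n+|x|}$, $A(x,y,k) < \frac{1}{k}$ then $\exists z$, $y \sqsubseteq z \sqsubseteq y^\infty$, $P(x,z) < 2^{-n-|x|} P_Y(z)$. Thus $\tilde{W}^{X \times Y}_{n,y^\infty} \subset \tilde{U}^{X \times Y}_{n,y^\infty}$. Since $\tilde{U}^{X \times Y}_n \subset \tilde{W}^{X \times Y}_n$, from (15) and the preceding inequality, $P(\tilde{U}^{X \times Y}_n)$ is bounded by a constant multiple of $2^{-n}$.

Since $U^{X \times Y} := \{(n,x,y) \mid (x,y) \in U^{X \times Y}_n\}$ is r.e. and $\sum_n P(\tilde{U}^{X \times Y}_n) < \infty$, we have $\limsup_n \tilde{U}^{X \times Y}_n \subset (\mathcal{R}^P)^c$ and $\mathcal{R}^P_{y^\infty} \subset \mathcal{R}^{P(\cdot|y^\infty), y^\infty}$. The converse inclusion is shown in [19]. ■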
Lemma 3.1 There is a r.e. set $W_n$ that satisfies (12)-(16).

Proof) We construct a r.e. set $W_n \subset S \times S \times \mathbb{N}$ by induction. Let $W(0) := \emptyset$. Suppose that $W(t-1) \subset V_n$ is finite, $W^{X \times Y}(t-1) := \{(x,z) \mid (x,z,k) \in W(t-1)\}$ is non-overlapping, and

$$\forall z^\infty \in \Omega,\ \sum_{(x,y,k) \in W(t-1),\ y \sqsubseteq z^\infty} A(x,y,k) < 3 \cdot 2^{-n}. \tag{17}$$

Since $W(t-1)$ is finite, there is a finite non-overlapping set $W^Y$ such that $\bigcup_{y \in W^Y} \Delta(y) = \Omega$ and $\sigma\{\Delta(y) \mid y \in W^Y\} = \sigma\{\Delta(y) \mid (x,y,k) \in W(t-1)\}$. Since $V_n$ is a r.e. set, let $v : \mathbb{N} \to V_n$ be a computable function such that $v(\mathbb{N}) = V_n$. Let

$$w(t) := \{(x,z',k) \in S \times S \times \mathbb{N} \mid v(t) = (x,y,k),\ \exists z \in W^Y,\ z' := y \vee z \text{ exists},\ W(t-1) \cup \{(x,z',k)\} \text{ satisfies (17)},\ \tilde{W}^{X \times Y}(t-1) \cap \Delta(x,z') = \emptyset\},$$

and $W(t) := W(t-1) \cup w(t)$. Let $w^Y := \{z' \mid (x,z',k) \in w(t)\}$. Since $W^Y$ is non-overlapping, $w^Y$ is non-overlapping. Hence (i) if $(x,z',k) \in w(t)$ and $z^\infty \in \Delta(z')$ then $\{(x,y,k) \mid (x,y,k) \in W(t),\ y \sqsubseteq z^\infty\} = \{(x,y,k) \mid (x,y,k) \in W(t-1),\ y \sqsubseteq z^\infty\} \cup \{(x,z',k)\}$ and (ii) if $z^\infty \notin \tilde{w}^Y$ then $\{(x,y,k) \mid (x,y,k) \in W(t),\ y \sqsubseteq z^\infty\} = \{(x,y,k) \mid (x,y,k) \in W(t-1),\ y \sqsubseteq z^\infty\}$, see Figure 1. Thus (17) holds for $W(t)$. By induction, $W(t)$ is finite and satisfies (17) for all $t$.

Since $W(t-1)$ is finite, we see that $w(t)$ is decidable. Let $W_n := \bigcup_t W(t)$, then $W_n$ is a r.e. set. Since $\forall t$, $W(t-1) \subset W(t)$, from (17), we have (15). From (8) we have (12). From the last condition of the definition of $w(t)$, we have (13) and (14). From (9), we have $\sum_{x \in V^{X \times Y}_{n,y^\infty}} 2^{-|x|} \le 1$. Let $V'_{n,y^\infty} \subset \{(x,y,k) \mid (x,y,k) \in V_n,\ y \sqsubseteq y^\infty\}$ be such that (i) $(x,y,k), (x,y',k') \in V'_{n,y^\infty} \Rightarrow y = y',\ k = k'$ and (ii) $(x,y,k) \in V_n$, $y \sqsubseteq y^\infty \Rightarrow \exists y' \sqsubseteq y^\infty, k'$, $(x,y',k') \in V'_{n,y^\infty}$. Then for any $V'_{n,y^\infty}$ that satisfies (i) and (ii), from (7) and (11), we have

$$\sum_{(x,y,k) \in V'_{n,y^\infty}} A(x,y,k) < 2 P(\tilde{U}^{y^\infty}_n | y^\infty) + \sum_{x \in V^{X \times Y}_{n,y^\infty}} 2^{-n-|x|} < 3 \cdot 2^{-n}.$$

Thus $(x,y,k) \in V_n$, $y \sqsubseteq y^\infty \Rightarrow \exists y' \sqsubseteq y^\infty, k'$, $(x,y',k') \in W_n$ and hence $\tilde{V}^{X \times Y}_{n,y^\infty} \subset \tilde{W}^{X \times Y}_{n,y^\infty}$. From (10) and (12), we have (16). ■

Figure 1: This figure illustrates a construction of $W_n$. For example, suppose that $W(t-1) = \{(x_1,y_1,k_1), (x_2,y_2,k_2)\}$, $\alpha = \Delta(x_1,y_1)$, and $\beta = \Delta(x_2,y_2)$ for some $t$ as shown in the figure. $W^Y$ is illustrated by the partition on the y-axis. If $v(t) = (x_3,y_3,k_3)$ and $\gamma = \Delta(x_3,y_3)$ (the rectangle below $\alpha$) then $\gamma$ is divided into $\gamma_1 = \Delta(x_3,y_1)$ and $\gamma_2 = \Delta(x_3,z')$. If $A(x_1,y_1,k_1) + A(x_3,y_1,k_3) < 3 \cdot 2^{-n}$ then $(x_3,y_1,k_3) \in W(t)$, and if $A(x_3,z',k_3) < 3 \cdot 2^{-n}$ then $(x_3,z',k_3) \in W(t)$.

4 Likelihood ratio test

Let $P$ and $Q$ be computable probabilities on $\Omega$. Let

$$r(x) := \frac{Q(x)}{P(x)}$$

for $x \in S$. We see that $r$ is a computable martingale. By the martingale convergence theorem for algorithmically random sequences [19], we have

Corollary 4.1 $\mathcal{R}^P \subset \{x^\infty \mid \lim_{x \to x^\infty} r(x) < \infty\}$.

The following lemma appeared in [3].

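Lemma 4.1 Let $P$ and $Q$ be computable probabilities on $\Omega$.

a) $\mathcal{R}^P \cap \mathcal{R}^Q = \mathcal{R}^P \cap \{x^\infty \mid 0 < \lim_{x \to x^\infty} r(x) < \infty\}$.

b) $\mathcal{R}^P \cap (\mathcal{R}^Q)^c = \mathcal{R}^P \cap \{x^\infty \mid \lim_{x \to x^\infty} r(x) = 0\}$.

Proof) a) If $x^\infty \in \mathcal{R}^P \cap \mathcal{R}^Q$ then $P(x) > 0$ and $Q(x) > 0$ for $x \sqsubseteq x^\infty$. From Corollary 4.1 we have $0 < \lim_{x \to x^\infty} r(x) < \infty$. Conversely, if $x^\infty \in \mathcal{R}^P \cap \{x^\infty \mid 0 < \lim_{x \to x^\infty} r(x) < \infty\}$, by Theorem 2.2, $\sup_{x \sqsubseteq x^\infty} -\log P(x) - Km(x) < \infty$ and $\sup_{x \sqsubseteq x^\infty} |-\log Q(x) + \log P(x)| < \infty$. Thus, $\sup_{x \sqsubseteq x^\infty} -\log Q(x) - Km(x) < \infty$ and we have $x^\infty \in \mathcal{R}^Q$.
b) From a, we have $\mathcal{R}^P \cap (\mathcal{R}^Q)^c = \mathcal{R}^P \cap (\mathcal{R}^P \cap \mathcal{R}^Q)^c = \mathcal{R}^P \cap (\{\lim r = 0\} \cup \{\lim r = \infty\}) = \mathcal{R}^P \cap \{\lim r = 0\}$, where the last equality follows from Corollary 4.1. ■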
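The likelihood ratio used in this section, the quotient of Q and P on cylinders, can be computed exactly for simple measures. The sketch below takes two Bernoulli product measures as an assumed example (the paper's P and Q are arbitrary computable probabilities), checks the martingale identity with respect to P at a given string, and evaluates the ratio along prefixes of the all-zero sequence, where it tends to 0.

```python
from fractions import Fraction


def bernoulli(theta):
    """Cylinder probabilities of an i.i.d. measure with P(bit = 1) = theta."""
    def measure(x):
        p = Fraction(1)
        for bit in x:
            p *= theta if bit == "1" else 1 - theta
        return p
    return measure


P = bernoulli(Fraction(1, 2))  # assumed reference measure
Q = bernoulli(Fraction(3, 4))  # assumed alternative measure


def r(x):
    """Likelihood ratio Q(x) / P(x); P(x) > 0 for every x in this example."""
    return Q(x) / P(x)


def is_martingale_at(x):
    """Check that the P-weighted sum of r over one-bit extensions gives back r(x)."""
    return P(x + "0") * r(x + "0") + P(x + "1") * r(x + "1") == P(x) * r(x)


print(is_martingale_at("0110"), [r("0" * n) for n in range(4)])
```

The identity holds because the P-weighted value of the ratio at an extension is just Q of that extension, and Q is additive over the two one-bit extensions.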

Remark 3 Let $g$ be an unbounded increasing total-computable function and $\mathcal{A}_{g,n} := \{(x,y) \mid |x| = n,\ (x,y) \in \mathcal{A}_g\}$, where $\mathcal{A}_g$ is defined in (6). Let $\mathcal{F}_n := \sigma\{\Delta(x,y) \mid (x,y) \in \mathcal{A}_{g,n}\}$ and $r_n(x^\infty, y^\infty) := \frac{Q(x,y)}{P(x,y)}$, $(x,y) \in \mathcal{A}_{g,n}$. Then $\{r_n\}$ is a martingale with respect to $\{\mathcal{F}_n\}$. If we replace $\lim_{x \to x^\infty} r(x)$ with $\lim_{(x,y) \to (x^\infty, y^\infty),\ (x,y) \in \mathcal{A}_g(x^\infty, y^\infty)} r(x,y)$ in Corollary 4.1 and Lemma 4.1, they hold for computable probabilities on $\Omega^2$.

Remark 4 In a similar manner to the proof of Lemma 4.1 a), we have $\mathcal{R}^P \cap \mathcal{R}^Q = \mathcal{R}^P \cap \{x^\infty \mid 0 < \inf_{x \sqsubseteq x^\infty} r(x)\}$. If we replace $\inf_{x \sqsubseteq x^\infty}$ with $\inf_{(x,y) \in \mathcal{A}_g(x^\infty, y^\infty)}$ for unbounded increasing total-computable $g$, it holds for computable probabilities on $\Omega^2$.

4.1 Absolute continuity and mutual singularity 

By the Lebesgue decomposition theorem, there exists $N \in \mathcal{B}$ such that $P(N) = 0$ and

$$\forall C \in \mathcal{B},\ Q(C) = \int_C r(x^\infty)\, dP + Q(C \cap N). \tag{18}$$

We write (a) $P \perp Q$ if $P$ and $Q$ are mutually singular, i.e., there exist $A$ and $B$ such that $A \cap B = \emptyset$, $P(A) = 1$, and $Q(B) = 1$, and (b) $P \ll Q$ if $P$ is absolutely continuous with respect to $Q$, i.e., $\forall C \in \mathcal{B}$, $Q(C) = 0 \Rightarrow P(C) = 0$.

Remark 5 By (18), we have (a) $P \perp Q$ iff $P(\{\lim r = 0\}) = 1$, and (b) $P \ll Q$ iff $P(\{\lim r = 0\}) = 0$; for example, see [T4].

The following theorem appeared on p. 103 of [12] without proof.

Theorem 4.1 (Martin-Löf) Let $P$ and $Q$ be computable probabilities on $\Omega$. Then, $\mathcal{R}^P \cap \mathcal{R}^Q = \emptyset$ iff $P \perp Q$.

Proof) Since $P(\mathcal{R}^P) = Q(\mathcal{R}^Q) = 1$, the only if part follows. Conversely, assume that $P \perp Q$. Let $N := \{x^\infty \mid 0 < \liminf_{x \sqsubseteq x^\infty} r(x) \le \limsup_{x \sqsubseteq x^\infty} r(x) < \infty\}$. By Remark 5 we have $P(N) = Q(N) = 0$. Since $0 < \liminf_{x \sqsubseteq x^\infty} r(x) \Leftrightarrow 0 < \inf_{x \sqsubseteq x^\infty} r(x)$ and $\limsup_{x \sqsubseteq x^\infty} r(x) < \infty \Leftrightarrow \sup_{x \sqsubseteq x^\infty} r(x) < \infty$, we have

$$N = \{x^\infty \mid 0 < \inf_{x \sqsubseteq x^\infty} r(x) \le \sup_{x \sqsubseteq x^\infty} r(x) < \infty\} = \bigcup_{a, b \in \mathbb{Q},\ 0 < a < b}\ \bigcap_i \tilde{N}^{a,b}_i,$$

where $N^{a,b}_i = \{x \mid a < r(y) < b,\ \forall y \sqsubseteq x,\ |x| = i\}$. Since $P(N) = 0$, we have $\lim_i P(\tilde{N}^{a,b}_i) = 0$. Since $(N^{a,b}_i)^c \cap \{x \mid P(x) > 0\}$ is a r.e. set, we can approximate $P(\tilde{N}^{a,b}_i)$ from above, and there is a computable function $\alpha(n)$ such that $P(\tilde{N}^{a,b}_{\alpha(n)}) < 2^{-n}$. Thus, $\{N^{a,b}_{\alpha(n)}\}$ is a test of $P$, and hence, $N \subset (\mathcal{R}^P)^c$. From Lemma 4.1 a, we have $\mathcal{R}^P \cap \mathcal{R}^Q = \emptyset$. ■

From Lemma 4.1 b and Remark 5 we have

Lemma 4.2 $\mathcal{R}^P \subset \mathcal{R}^Q \Rightarrow P \ll Q$ for computable probabilities $P$ and $Q$ on $\Omega$.

There is a counter-example for the converse implication of the above lemma, see [3]. The above results are related to Kakutani's theorem on product martingale [H [25], see [3 [23].

4.2 Countable model class 

In the following discussion, let $\{P_n\}_{n \in \mathbb{N}}$ be a family of computable probabilities on $\Omega$; more precisely, we assume that there is a computable function $A : \mathbb{N} \times S \times \mathbb{N} \to \mathbb{Q}$ such that $|A(n,x,k) - P_n(x)| < 1/k$ for all $n, k \in \mathbb{N}$ and $x \in S$. Note that we cannot set $\{P_n\}_{n \in \mathbb{N}}$ as the entire family of computable probabilities on $\Omega$ since it is not a r.e. set. Let $\alpha$ be a computable positive probability on $\mathbb{N}$, i.e., $\forall n$, $\alpha(n) > 0$ and $\sum_n \alpha(n) = 1$. Then, set $P := \sum_n \alpha(n) P_n$. We see that $P$ is a computable probability. The following lemma is a special case (discrete version) of Corollary 3.1.

Lemma 4.3 $\mathcal{R}^P = \bigcup_n \mathcal{R}^{P_n}$.

Proof) Let $P_Y(y^\infty) := \alpha(n)$ and $P'(x; y^\infty) := P_n(x)$ if $y^\infty = 0^n 1 0^\infty$, and $0$ otherwise, respectively. Let $P'(x,y) := \int_{\Delta(y)} P'(x; y^\infty)\, dP_Y$ for $x, y \in S$, then $P'$ is a computable probability on $X \times Y = \Omega^2$. We see that $P'_X(x) = \sum_n \alpha(n) P_n(x)$, $\mathcal{R}^{P_Y} = \{0^n 1 0^\infty \mid n \in \mathbb{N}\}$, and $\mathcal{R}^{P'}_{y^\infty} = \mathcal{R}^{P_n}$ if $y^\infty = 0^n 1 0^\infty$. Since $\mathcal{R}^{P'_X} = \bigcup_{y^\infty \in \mathcal{R}^{P_Y}} \mathcal{R}^{P'}_{y^\infty}$ (Corollary 3.1), we have the lemma. ■

Let $\beta$ be a computable probability on $\mathbb{N}$ such that 1) $\beta(n) > 0$ if $n \neq n^*$ and $\beta(n^*) = 0$, and 2) $\sum_n \beta(n) = 1$. Then, set

$$P^- := \sum_n \beta(n) P_n.$$

We see that $P^-$ is a computable probability. By Lemma 4.1 and Lemma 4.3 we have
Corollary 4.2

$$\mathcal{R}^{P_{n^*}} \cap \bigcap_{n \neq n^*} (\mathcal{R}^{P_n})^c = \mathcal{R}^{P_{n^*}} \cap \{x^\infty \mid \lim_{x \to x^\infty} P^-(x)/P_{n^*}(x) = 0\}.$$

Let

$$\hat{n}(x) := \arg\max_n \alpha(n) P_n(x).$$

In [TJ [2], it is shown that $\lim_{x \to x^\infty} P^-(x)/P_{n^*}(x) = 0 \Rightarrow \lim_{x \to x^\infty} \hat{n}(x) = n^*$. Thus we have

Corollary 4.3 $\mathcal{R}^{P_{n^*}} \cap \bigcap_{n \neq n^*} (\mathcal{R}^{P_n})^c \subset \{x^\infty \mid \lim_{x \to x^\infty} \hat{n}(x) = n^*\}$.

The above corollary shows that if $x^\infty$ is random with respect to $P_{n^*}$ and it is not random with respect to the other models then $\hat{n}$ classifies its model. Estimation of models by $\hat{n}$ is called MDL model selection; for more details, see [TJ. Note that by Theorem 4.1, if $\{P_n\}$ are mutually singular, then $\mathcal{R}^{P_{n^*}} \cap \bigcap_{n \neq n^*} (\mathcal{R}^{P_n})^c = \mathcal{R}^{P_{n^*}}$, and by Lemma 4.2, if $P_{n^*} \not\ll P^-$, then $\mathcal{R}^{P_{n^*}} \cap \bigcap_{n \neq n^*} (\mathcal{R}^{P_n})^c \neq \emptyset$.
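The model selection rule discussed above picks the index that maximizes the prior weight times the model probability of the observed prefix. The sketch below illustrates it under strong simplifying assumptions that are not from the paper: the family is a finite list of three Bernoulli measures (`THETAS`) instead of a countable family, the prior (`ALPHA`) is uniform, and all names are hypothetical.

```python
from fractions import Fraction

# Assumed finite model class and prior, chosen only for illustration.
THETAS = [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)]
ALPHA = [Fraction(1, 3)] * 3


def model_prob(n, x):
    """Probability of the cylinder of x under the n-th Bernoulli model."""
    p = Fraction(1)
    for bit in x:
        p *= THETAS[n] if bit == "1" else 1 - THETAS[n]
    return p


def mixture(x):
    """Mixture probability: prior-weighted sum of the model probabilities."""
    return sum((ALPHA[n] * model_prob(n, x) for n in range(len(THETAS))),
               Fraction(0))


def n_hat(x):
    """Index maximizing prior weight times model probability of x."""
    return max(range(len(THETAS)), key=lambda n: ALPHA[n] * model_prob(n, x))


print(n_hat("1111111101"), n_hat("0101"), mixture("0") + mixture("1"))
```

On a prefix dominated by ones the rule selects the model with the largest parameter, and the mixture remains additive over one-bit extensions, as a probability on cylinders must be.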

5 Decomposition of complexity 

It can be shown that

$$\sup_{x,y \in S} |Km(x,y) - Km(x|y) - Km(y)| = \infty. \tag{19}$$

The above equation shows that there is a sequence of strings such that the left-hand side of the above equation is unbounded. However, if we restrict strings to an increasing sequence of prefixes of random sequences $x^\infty, y^\infty$ with respect to some computable probability and the convergence rate of the conditional probability is effective, then we can show that the left-hand side of (19) is bounded (see Theorem 5.1 below).

Let $P$ be a computable probability on $X \times Y = \Omega^2$. From Theorem 3.1,

$$\forall x,\ P(x|y) \to P(x|y^\infty) \text{ as } y \to y^\infty \in \mathcal{R}^{P_Y}. \tag{20}$$

Observe that

$$P(x,y) > 0,\ P(x|y^\infty) > 0 \text{ if } (x,y) \sqsubseteq (x^\infty, y^\infty) \in \mathcal{R}^P. \tag{21}$$

This follows from the fact that $P(x,y) = 0$, $(x,y) \sqsubseteq (x^\infty, y^\infty) \Rightarrow (x^\infty, y^\infty) \notin \mathcal{R}^P$. If $P(x|y^\infty) = 0$ then from (20), we have $\forall n\ \exists y \sqsubseteq y^\infty$, $P(x|y) < 2^{-n}$. Since $U_n := \{(x,y) \mid P(x,y) < 2^{-n} P_Y(y)\}$ is a test of $P$, we have $(x^\infty, y^\infty) \in \bigcap_n \tilde{U}_n$ if $P(x|y^\infty) = 0$.

If $(x^\infty, y^\infty) \in \mathcal{R}^P$ then from (20) and (21), we have

$$\forall x \sqsubseteq x^\infty,\ f > 0\ \exists N\ \forall y \sqsubseteq y^\infty,\ N \le |y| \Rightarrow \left| \frac{P(x|y)}{P(x|y^\infty)} - 1 \right| < f.$$

By letting $f$ be a function of $|x|$, we have for any $f : \mathbb{N} \to \{q \in \mathbb{Q} \mid q > 0\}$, there is $g : \mathbb{N} \to \mathbb{N} \cup \{0\}$ such that

$$\forall (x,y) \sqsubseteq (x^\infty, y^\infty),\ g(|x|) \le |y| \Rightarrow \left| \frac{P(x|y)}{P(x|y^\infty)} - 1 \right| < f(|x|). \tag{22}$$

In the above, $g$ depends on $f$ and $(x^\infty, y^\infty)$. We say that the conditional probability $P(\cdot|y^\infty)$ $f, (x^\infty, y^\infty)$ effectively converges if there is a total-computable monotonically increasing $g$ in (22), where we allow $g$ to be bounded, see Remark 6. Such a $g$ is called an effective convergence rate function.

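Lemma 5.1 Let $P$ be a computable probability on $X \times Y = \Omega^2$ and
$(x^\infty,y^\infty) \in \mathcal{R}^P$. Let $f : \mathbb{N} \to \{q \in \mathbb{Q} \mid 0 < q < 1\}$
be such that $\sum_n f(n) < \infty$. Assume that $P(\cdot|y^\infty)$ $f,(x^\infty,y^\infty)$
effectively converges. Let $g$ be an effective convergence rate function. Then there is a
computable monotone function $e : (S \cup \Omega)^2 \to S \cup \Omega$ such that

$$\exists c\, \exists p^\infty \in \Omega\ \forall (x,y) \sqsubset (x^\infty,y^\infty)\ \exists p \sqsubset p^\infty,\quad g(|x|) = |y| \Rightarrow x \sqsubset e(p,y),\ |p| \le -\log P(x|y) + c, \tag{23}$$

$$\exists c\ \forall (x,y) \sqsubset (x^\infty,y^\infty),\quad g(|x|) = |y| \Rightarrow Km(x|y) \le -\log P(x|y) + c. \tag{24}$$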
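Proof) Let

$$P'(0|y^\infty) := P(0|y) \text{ for } |y| = g(1),\ y \sqsubset y^\infty, \qquad P'(1|y^\infty) := 1 - P'(0|y^\infty), \tag{25}$$

and for $x \in S$ with $1 \le |x|$,

$$P'(x0|y^\infty) := P'(x|y^\infty)\,\frac{P(x0|y)}{P(x|y)} \text{ if } P(x|y) > 0 \text{ for } |y| = g(|x|+1),\ y \sqsubset y^\infty, \qquad P'(x1|y^\infty) := P'(x|y^\infty) - P'(x0|y^\infty). \tag{26}$$

Since $P(x|y) > 0 \Leftrightarrow \forall (x',y') \sqsubset (x,y),\ P(x'|y') > 0$ and $g$ is computable,
we see that there is a partial computable $A : S \times S \times \mathbb{N} \to \mathbb{Q}$ such that

$$\forall y^\infty\, \forall x,y,k,\quad |P'(x|y^\infty) - A(x,y,k)| < \frac{1}{k} \text{ if } y \sqsubset y^\infty,\ g(|x|) = |y|,\ P(x|y) > 0. \tag{27}$$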
Let $D := \{(x,y) \mid g(|x|) = |y|,\ P(x|y) > 0\}$. From (27), we can construct
a family of half-open intervals $V_{(x,y)} \subset [0,1]$, $(x,y) \in D$, such that 1) the end-points
of $V_{(x,y)}$ are computable with arbitrary precision from $(x,y) \in D$ and
$|V_{(x,y)}| = P'(x|y^\infty)$, and 2) if $(x,y), (x',y') \in D$ and $y$ and $y'$ are comparable,
then $x \sqsubset x' \Rightarrow V_{(x',y')} \subset V_{(x,y)}$ and
$\Delta(x) \cap \Delta(x') = \emptyset \Rightarrow V_{(x,y)} \cap V_{(x',y')} = \emptyset$.
Let $F := \{(s,y,x) \mid J_s \subset V_{(x,y)},\ (x,y) \in D\} \cup \{(s,y,\lambda) \mid s,y \in S\}$. Then $F$ is
r.e. and satisfies ([Q). Let $e$ be the monotone function defined by $F$. Then

$$\forall y^\infty\, \exists c\, \forall x,y,\quad Km_e(x|y) \le -\log P'(x|y^\infty) + c \text{ if } y \sqsubset y^\infty,\ g(|x|) = |y|,\ P(x|y) > 0. \tag{28}$$

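By replacing $P(x|y)$ in (25) with $P(x|y^\infty)$, from (22), we have for $|x| = 1$,

$$1 - f(1) \le \frac{P'(x|y^\infty)}{P(x|y^\infty)} \le 1 + f(1).$$

Similarly, by replacing $P(xz|y)$ and $P(x|y)$ in (26) with $P(xz|y^\infty)$ and $P(x|y^\infty)$
respectively, from (22), we have for $1 \le |x|$, $|z| = 1$,

$$\frac{1 - f(|x|+1)}{1 + f(|x|)}\,\frac{P'(x|y^\infty)}{P(x|y^\infty)} \le \frac{P'(xz|y^\infty)}{P(xz|y^\infty)} \le \frac{P'(x|y^\infty)}{P(x|y^\infty)}\,\frac{1 + f(|x|+1)}{1 - f(|x|)}.$$

Therefore we have

$$\frac{\prod_{i=1}^{|x|}(1 - f(i))}{\prod_{i=1}^{|x|-1}(1 + f(i))} \le \frac{P'(x|y^\infty)}{P(x|y^\infty)} \le \frac{\prod_{i=1}^{|x|}(1 + f(i))}{\prod_{i=1}^{|x|-1}(1 - f(i))} \text{ if } P(x|y) > 0.$$

Since $0 < \prod_{i=1}^{\infty}(1 - f(i)) < \prod_{i=1}^{\infty}(1 + f(i)) < \infty$ if $\sum_n f(n) < \infty$ and
$0 < f < 1$, from (28), we have the lemma. ■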

Theorem 5.1 Let $P$ be a computable probability on $X \times Y = \Omega^2$ and
$(x^\infty,y^\infty) \in \mathcal{R}^P$. Let $f : \mathbb{N} \to \{q \in \mathbb{Q} \mid 0 < q < 1\}$
be such that $\sum_n f(n) < \infty$. Assume that $P(\cdot|y^\infty)$ $f,(x^\infty,y^\infty)$
effectively converges. Let $g$ be an effective convergence rate function. Then

$$\sup_{(x,y) \in A_g(x^\infty,y^\infty)} |Km(x|y) + \log P(x|y)| < \infty, \tag{29}$$

$$\sup_{(x,y) \in A_g(x^\infty,y^\infty)} |Km(x,y) - Km(x|y) - Km(y)| < \infty, \tag{30}$$

where $A_g$ is defined in (Ej). In addition, if $P(\cdot|y^\infty)$ is computable relative to
$y^\infty$, then

$$\sup_{(x,y) \in A_g(x^\infty,y^\infty)} Km(x|y) - Km(x|y^\infty) < \infty. \tag{31}$$
Proof) From Corollary, if $(x^\infty,y^\infty) \in \mathcal{R}^P$ then

$$\sup_{(x,y) \in A_g(x^\infty,y^\infty)} |\log P(x,y) + Km(x,y)| < \infty, \qquad \sup_{y \sqsubset y^\infty} |\log P_Y(y) + Km(y)| < \infty, \tag{32}$$

$$\sup_{(x,y) \in A_g(x^\infty,y^\infty)} -\log P(x|y) - Km(x|y) < \infty. \tag{33}$$

From (23) and (33), we have (29). From (32) and (29), we have (30).

If $P(\cdot|y^\infty)$ is computable relative to $y^\infty$, Theorem 3.3 holds, i.e.,
$(x^\infty,y^\infty) \in \mathcal{R}^P$ iff $x^\infty \in \mathcal{R}^{P(\cdot|y^\infty),y^\infty}$ and
$y^\infty \in \mathcal{R}^{P_Y}$. By the relativized version of the Levin-Schnorr
theorem, we have

$$\sup_{x \sqsubset x^\infty} -\log P(x|y^\infty) - Km(x|y^\infty) < \infty$$

for $(x^\infty,y^\infty) \in \mathcal{R}^P$. Since $Km(x|y^\infty) \le Km(x|y)$ for $y \sqsubset y^\infty$, from (22) and
(29), we have (31). ■

Example 2 Let $P'$ be a computable probability on $\Omega$. For $x = x_1 \cdots x_n$,
$y = y_1 \cdots y_m \in S$, let

$$\Delta(x \oplus y) := \{z_1 z_2 \cdots \in \Omega \mid z_{2i-1} = x_i \text{ for } i \le n,\ z_{2i} = y_i \text{ for } i \le m\},$$

$$P(x,y) := P'(\Delta(x \oplus y)).$$

Then $P$ is a computable probability on $X \times Y = \Omega^2$, i.e., $X$ and $Y$ are the
spaces of odd and even coordinates, respectively. For $x^\infty = x_1 x_2 \cdots$,
$y^\infty = y_1 y_2 \cdots$, let

$$x^\infty \oplus y^\infty := x_1 y_1 x_2 y_2 \cdots \in \Omega,$$

then

$$(x^\infty,y^\infty) \in \mathcal{R}^P \Leftrightarrow x^\infty \oplus y^\infty \in \mathcal{R}^{P'}.$$

From Theorem 3.3, if the conditional probability is computable relative to
$y^\infty \in \mathcal{R}^{P_Y}$, then $x^\infty \oplus y^\infty$ is random with respect to $P'$ iff $y^\infty$ is random and
$x^\infty$ is random with respect to the conditional probability at $y^\infty$. Let $P'$ be
a computable first order Markov process, i.e., $P'(z_1 \cdots z_n) = p_{z_1} \prod_{i=2}^{n} p_{z_{i-1},z_i}$,
where $\forall i,j \in \{0,1\}$, $0 < p_i, p_{i,j} < 1$, $\sum_i p_i = 1$, $\sum_j p_{i,j} = 1$. We see that
$P(x|y^\infty) = P(x|y_1 \cdots y_{|x|})$. Thus $g(n) = n$ satisfies (22) for any $f$ and
Theorem 5.1 holds.
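The claim $P(x|y^\infty) = P(x|y_1 \cdots y_{|x|})$ in Example 2 can be checked numerically on finite prefixes. The following is a minimal sketch, not part of the original argument: the function names `cylinder_prob` and `conditional` and the particular transition probabilities used below are illustrative assumptions. It computes $P(x,y) = P'(\Delta(x \oplus y))$ for a first order Markov process by a forward recursion over the interleaved coordinates, using exact rational arithmetic.

```python
from fractions import Fraction


def cylinder_prob(x, y, init, trans):
    """P(x, y) = P'(Delta(x (+) y)) for a first-order Markov chain P'.

    x fixes the odd coordinates z_1, z_3, ..., y fixes the even coordinates
    z_2, z_4, ...; coordinates that are not fixed are summed out.
    init[b] and trans[a][b] are hypothetical names for p_b and p_{a,b}.
    """
    n, m = len(x), len(y)
    length = max(2 * n - 1, 2 * m)  # last coordinate that is constrained
    if length <= 0:
        return Fraction(1)

    def allowed(pos):
        # Values permitted at the 1-based coordinate pos.
        if pos % 2 == 1:
            i = (pos + 1) // 2
            return (x[i - 1],) if i <= n else (0, 1)
        i = pos // 2
        return (y[i - 1],) if i <= m else (0, 1)

    # Forward recursion: alpha[b] = probability of the constraints so far
    # together with the current coordinate being b.
    alpha = {b: init[b] for b in allowed(1)}
    for pos in range(2, length + 1):
        alpha = {
            b: sum(alpha[a] * trans[a][b] for a in alpha)
            for b in allowed(pos)
        }
    return sum(alpha.values())


def conditional(x, y, init, trans):
    """P(x | y) = P(x, y) / P_Y(y), where P_Y(y) = P(empty string, y)."""
    return cylinder_prob(x, y, init, trans) / cylinder_prob((), y, init, trans)


# Illustrative parameters (assumed, not taken from the paper).
INIT = {0: Fraction(1, 3), 1: Fraction(2, 3)}
TRANS = {
    0: {0: Fraction(1, 4), 1: Fraction(3, 4)},
    1: {0: Fraction(2, 5), 1: Fraction(3, 5)},
}
```

With these parameters, `conditional((1, 0), (0, 1), INIT, TRANS)` and `conditional((1, 0), (0, 1, 1, 0), INIT, TRANS)` agree exactly, which is the finite form of the statement that $g(n) = n$ is a convergence rate function for this model.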
Remark 6 In Lemma 5.1 and Theorem 5.1, $g$ need not be unbounded if
(22) holds. For example, if $P := P_X P_Y$ then $g := 0$ satisfies (22) for any $f$.

5.1 Independence 

We show some equivalent conditions for independence of two individual
sequences. The following result shows that if $(x^\infty,y^\infty)$ is random with respect
to some computable probability (in [13] such a sequence is called natural),
then we can represent independence of $(x^\infty,y^\infty)$ in terms of complexity.

Corollary 5.1 Let $P$ be a computable probability on $\Omega^2$ and $(x^\infty,y^\infty) \in \mathcal{R}^P$.
Assume that $P(\cdot|y^\infty)$ is computable relative to $y^\infty$ and $f,(x^\infty,y^\infty)$
effectively converges for f = 1. Let $Q$ be a computable probability such that
$\forall x,y,\ Q(x,y) := P_X(x)P_Y(y)$. The following statements are equivalent:

a) $(x^\infty,y^\infty) \in \mathcal{R}^Q$.

b) for any unbounded computable increasing $g$,
$\sup_{(x,y) \in A_g(x^\infty,y^\infty)} |Km(x,y) - Km(x) - Km(y)| < \infty$.

c) $\sup_{x \sqsubset x^\infty} Km(x) - Km(x|y^\infty) < \infty$.

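Proof) a $\Rightarrow$ b: Every increasing computable $g$ satisfies (22) for $Q$. From Theorem 5.1,
if $(x^\infty,y^\infty) \in \mathcal{R}^Q$ then $\sup_{(x,y) \in A_g(x^\infty,y^\infty)} |Km(x|y) + \log P_X(x)| < \infty$,
$\sup_{x \sqsubset x^\infty} |Km(x) + \log P_X(x)| < \infty$, and (30) holds. Thus we have b.

b $\Rightarrow$ a: Let $g$ be an unbounded computable increasing function. Since
$\mathcal{R}^P \subset \mathcal{R}^{P_X} \times \mathcal{R}^{P_Y}$,

$$(x^\infty,y^\infty) \in \mathcal{R}^P \Rightarrow \sup_{(x,y) \in A_g(x^\infty,y^\infty)} |Km(x,y) + \log P(x,y)| < \infty,\quad \sup_{x \sqsubset x^\infty} |Km(x) + \log P_X(x)| < \infty,\quad \sup_{y \sqsubset y^\infty} |Km(y) + \log P_Y(y)| < \infty.$$

We have $0 < \inf_{(x,y) \in A_g(x^\infty,y^\infty)} \frac{Q(x,y)}{P(x,y)}$. From Lemma 4.1 (see Remark @J), we
have a.

a $\Rightarrow$ c: Since $g := 0$ satisfies (22) for $Q$, from Theorem 5.1 we have c, see
Remark 6.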

c=^a: Let g be an unbounded effective convergence rate function for P(-\y°°), f = 
1, and (x™,y™) G Then we have < = for (x,y) G 

*4. 9 (x°°, y°°). From Theorem 13.31 and Levin-Schnorr theorem, we have 
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sup^^oo I Km (x | y°° ) + log P(x\y°°) \ < oo and sup xCx oo \Km(x)+\ogP x (x)\ < 
oo. From the statement c), we have < inf^^g^^oo ^oo) ■ From 
Lemma [4.11 (see Remark UJ), we have a. ■ 
Note that $\mathcal{R}^P \cap \mathcal{R}^Q \neq \emptyset$ iff $P$ and $Q$ are not mutually singular
(Theorem 4.1) iff $P(\lim r > 0) > 0$ (Remark El).

6 Bayesian statistics 

Let $P$ be a computable probability on $X \times Y$ and let $P_X$, $P_Y$ be its marginal
distributions as before. In Bayesian statistical terminology, if $X$ is a sample
space, then $P_X$ is called the mixture distribution, and if $Y$ is a parameter space,
then $P_Y$ is called the prior distribution. We show that sections of the random set
satisfy many theorems of Bayesian statistics (see also [19]), and that they are natural
as a definition of the random set with respect to a conditional probability from the
Bayesian statistical point of view.

6.1 Consistency of posterior distribution 

We show consistency of the posterior distribution for algorithmically random
sequences. The classification of random sets by the likelihood ratio
test (see Section 4) plays an important role in this section.

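Theorem 6.1 Let $P$ be a computable probability on $X \times Y = \Omega^2$. The
following six statements are equivalent:

a) $P(\cdot|y) \perp P(\cdot|z)$ if $\Delta(y) \cap \Delta(z) = \emptyset$, $P_Y(y) > 0$, $P_Y(z) > 0$ for $y,z \in S$.

b) $\mathcal{R}^{P(\cdot|y)} \cap \mathcal{R}^{P(\cdot|z)} = \emptyset$ if $\Delta(y) \cap \Delta(z) = \emptyset$, $P_Y(y) > 0$, $P_Y(z) > 0$ for $y,z \in S$.

c) $P_{Y|X}(\cdot|x)$ converges weakly to $I_{y^\infty}$ as $x \to x^\infty$ for $(x^\infty,y^\infty) \in \mathcal{R}^P$, where
$I_{y^\infty}$ is the distribution that has probability 1 at $y^\infty$.

d) $\mathcal{R}^P_{y^\infty} \cap \mathcal{R}^P_{z^\infty} = \emptyset$ if $y^\infty \neq z^\infty$.

e) There exists a surjective function $f : \mathcal{R}^{P_X} \to \mathcal{R}^{P_Y}$ such that $f(x^\infty) = y^\infty$
for $(x^\infty,y^\infty) \in \mathcal{R}^P$.

f) There exist $f : X \to Y$ and $Y' \subset Y$ such that $P_Y(Y') = 1$ and $f = y^\infty$,
$P(\cdot|y^\infty)$-a.s. for $y^\infty \in Y'$.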

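Proof) a $\Leftrightarrow$ b follows from Theorem 4.1.

b $\Rightarrow$ c: If $(x^\infty,y^\infty) \in \mathcal{R}^P$, then $x^\infty \in \mathcal{R}^{P(\cdot|y)}$ and $P_Y(y) > 0$ for $y \sqsubset y^\infty$. If
$\Delta(y) \cap \Delta(z) = \emptyset$ and $P_Y(z) > 0$, then from the statement b, $x^\infty \notin \mathcal{R}^{P(\cdot|z)}$. If
$P_Y(z) > 0$ then from Lemma 4.1 we have $\lim_{x \to x^\infty} P(x|z)/P(x|y) = 0$, and

$$\lim_{x \to x^\infty} \frac{P(x|z)}{P(x|y)} = 0 \Leftrightarrow \lim_{x \to x^\infty} \frac{P(x,z)}{P(x,y)} = 0 \Leftrightarrow \lim_{x \to x^\infty} \frac{P_{Y|X}(z|x)}{P_{Y|X}(y|x)} = 0. \tag{34}$$

If $P_Y(z) = 0$ then the last equation in (34) holds. Hence the last equation
in (34) holds for all $z$ and we see that the posterior distribution $P_{Y|X}(\cdot|x)$
converges weakly to $I_{y^\infty}$.

c $\Rightarrow$ d: obvious.

d $\Rightarrow$ e: Since $\mathcal{R}^P_{y^\infty} \cap \mathcal{R}^P_{z^\infty} = \emptyset$ for $y^\infty \neq z^\infty$, we can define a function
$f : X \to Y$ such that $f(x^\infty) = y^\infty$ for $x^\infty \in \mathcal{R}^P_{y^\infty}$. From Corollary 3.1, we
have e, see Figure 2.

e $\Rightarrow$ f: By Theorem 3.2, we have f.

f $\Rightarrow$ a: Let $A_{y^\infty} := \{x^\infty \mid f(x^\infty) = y^\infty\}$. Then $A_{y^\infty} \cap A_{z^\infty} = \emptyset$ for $y^\infty \neq z^\infty$
and $P(A_{y^\infty}|y^\infty) = 1$ for $y^\infty \in Y'$. Thus
$(\bigcup_{y^\infty \in \Delta(y)} A_{y^\infty}) \cap (\bigcup_{y^\infty \in \Delta(z)} A_{y^\infty}) = \emptyset$
for $\Delta(y) \cap \Delta(z) = \emptyset$, and
$P(\bigcup_{y^\infty \in \Delta(y)} A_{y^\infty}|y) = P(\bigcup_{y^\infty \in \Delta(z)} A_{y^\infty}|z) = 1$, which
shows a. ■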



Usually, consistency of posterior distribution is derived from f, see [Ej. 
Note that statements a and f do not involve any algorithmic notion.

Example 3 Let $\{P(\cdot;y^\infty)\}_{y^\infty \in Y}$ be the parametric model of the Bernoulli
process, i.e., $P(x;y^\infty) := r(y^\infty)^{\sum_{i=1}^{n} x_i}(1 - r(y^\infty))^{n - \sum_{i=1}^{n} x_i}$, where $x = x_1 \cdots x_n$,
$y^\infty = y_1 y_2 \cdots$, and $r(y^\infty) := \sum_i y_i 2^{-i}$. Let $P_Y$ be a computable probability
on $\Omega$ and $P(x,y) := \int_{\Delta(y)} P(x;y^\infty)\,dP_Y$ for $x,y \in S$. Then $P$ is a computable
probability on $\Omega^2$. By the law of large numbers, statement f (and hence every
statement of Theorem 6.1) is satisfied. Note that the conditional probability $P(\cdot|y^\infty)$
is defined by $P$, see Section 4 in [TO]. In general, it is possible that
$P(\cdot|y^\infty) \neq P(\cdot;y^\infty)$ for $y^\infty$ in a null set.
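As an illustration of Example 3, the posterior $P(y|x) = P(x,y)/P_X(x)$ can be computed exactly when $P_Y$ is taken to be the uniform (Lebesgue) measure on $\Omega$. This choice of prior is an assumption made only for the sketch below; the example itself allows any computable $P_Y$. The function names `joint` and `posterior` are likewise illustrative. Under the uniform prior, $P(x,y) = \int_a^b r^k (1-r)^{n-k}\,dr$ with $k = \sum_i x_i$, $a = \sum_i y_i 2^{-i}$ and $b = a + 2^{-|y|}$, and the integral is evaluated by expanding $(1-r)^{n-k}$ with the binomial theorem.

```python
from fractions import Fraction
from math import comb


def joint(x, y):
    """P(x, y) = integral over Delta(y) of r^k (1 - r)^(n - k) dr.

    Assumes the prior P_Y is the uniform measure, so Delta(y) corresponds
    to the interval [a, b) with a = 0.y_1 y_2 ... in binary.
    """
    n, k = len(x), sum(x)
    a = sum(Fraction(bit, 2 ** (i + 1)) for i, bit in enumerate(y))
    b = a + Fraction(1, 2 ** len(y))
    total = Fraction(0)
    # (1 - r)^(n - k) = sum_j C(n - k, j) (-1)^j r^j, integrated term by term.
    for j in range(n - k + 1):
        e = k + j + 1
        total += (-1) ** j * comb(n - k, j) * (b ** e - a ** e) / e
    return total


def posterior(y, x):
    """P(y | x) = P(x, y) / P_X(x), with P_X(x) = P(x, empty string)."""
    return joint(x, y) / joint(x, ())
```

For a sample with 125 ones among 200 bits, the empirical frequency is 0.625, and the posterior mass of the prefix `(1, 0)` (the parameter interval [0.5, 0.75)) is already close to 1, in line with statement c of Theorem 6.1.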



6.2 Algorithmically best estimator 

We study the asymptotic theory of estimation for individual samples and
parameters from an algorithmic point of view.

Suppose that one of the statements of Theorem 6.1 holds. Then from
the statement c, we have $P(y|x^\infty) = 1$ for $y \sqsubset y^\infty$, $(x^\infty,y^\infty) \in \mathcal{R}^P$. Since
$P(y|x) \to P(y|x^\infty)$ as $x \to x^\infty$ if $x^\infty \in \mathcal{R}^{P_X}$, we have
$\forall \epsilon > 0,\ y \sqsubset y^\infty,\ \exists x \sqsubset x^\infty,\ P(y|x) > 1 - \epsilon$.
In particular there is an increasing $h$ such that
$\forall \epsilon,\ y \sqsubset y^\infty,\ x \sqsubset x^\infty,\ |x| \ge h(|y|) \Rightarrow P(y|x) > 1 - \epsilon$.
Roughly speaking, the following theorem shows that if this happens then $y$ is
estimated from $x$ of size $h$, and if $P(y|x)$ goes to 0 then we cannot estimate $y$
from sample size $h$.

Figure 2: $f : \mathcal{R}^{P_X} \to \mathcal{R}^{P_Y}$

Theorem 6.2 Let $P$ be a computable probability on $X \times Y = \Omega^2$. Let
$h : \mathbb{N} \to \mathbb{N}$ be an increasing computable function and
$A(x^\infty,y^\infty) := \{(x,y) \mid (x,y) \sqsubset (x^\infty,y^\infty),\ |x| = h(|y|)\}$.
For each $(x^\infty,y^\infty)$ we have:

a) If $\inf_{(x,y) \in A(x^\infty,y^\infty)} P(y|x) > 0$, then there is a computable function $\rho$ such
that $y = \rho(x)$ for infinitely many $(x,y) \in A(x^\infty,y^\infty)$, where $\rho$ need not be
monotone.

b) Let $f : \mathbb{N} \to \{q \in \mathbb{Q} \mid 0 < q < 1\}$ be such that $\sum_n f(n) < \infty$. Assume that
$P(\cdot|x^\infty)$ effectively converges for $f$ and $(x^\infty,y^\infty) \in \mathcal{R}^P$, i.e., there is a total
computable increasing $h : \mathbb{N} \to \mathbb{N}$ such that

$$|x| = h(|y|) \Rightarrow \left|\frac{P(y|x)}{P(y|x^\infty)} - 1\right| < f(|y|).$$

If $\inf_{(x,y) \in A(x^\infty,y^\infty)} P(y|x) > 0$, then there is a computable monotone function
$\rho$ such that $\forall (x,y) \in A(x^\infty,y^\infty),\ y \sqsubset \rho(x)$.

c) If $(x^\infty,y^\infty) \in \mathcal{R}^P$ and $\inf_{(x,y) \in A(x^\infty,y^\infty)} P(y|x) = 0$, then there is no
computable monotone function $\rho$ such that $\forall (x,y) \in A(x^\infty,y^\infty),\ y \sqsubset \rho(x)$.

Proof) a) By applying Shannon-Fano-Elias coding to $P(\cdot|x)$ on the finite
partition $\{y \mid |y| = h^{-1}(|x|)\}$, we can construct a computable function $e$ and
a program $p \in S$ such that $e(p,x) = y$ and $|p| = \lceil -\log P(y|x) \rceil + 1$. Here, $e$
need not be a monotone function. Since $\inf P(y|x) > 0$, $|p|$ stays bounded as
$x \to x^\infty$, so there is a $p_0$ such that $e(p_0,x) = y$ for infinitely many prefixes $x$
of $x^\infty$. Thus, $\rho(x) := e(p_0,x)$ satisfies a.

b) From (23), there is a computable monotone function $e$ and $p \in S$ such
that $\forall (x,y) \in A(x^\infty,y^\infty),\ y \sqsubset e(p,x)$. Let $\rho(x) := e(p,x)$; then $\rho$ satisfies b.

c) In the same way as (33), we have
$\sup_{(x,y) \in A(x^\infty,y^\infty)} -\log P(y|x) - Km(y|x) < \infty$. Since
$\sup_{(x,y) \in A(x^\infty,y^\infty)} -\log P(y|x) = \infty$, we have
$\sup_{(x,y) \in A(x^\infty,y^\infty)} Km(y|x) = \infty$. If there is a computable monotone function
$\rho$ such that $\forall (x,y) \in A(x^\infty,y^\infty),\ y \sqsubset \rho(x)$, then
$\sup_{(x,y) \in A(x^\infty,y^\infty)} Km(y|x) < \infty$, which is a contradiction. ■
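Part a) of the proof relies on Shannon-Fano-Elias coding. The following is a minimal sketch of the standard textbook construction for a finite distribution, given only to make the codeword length $\lceil -\log P(y|x) \rceil + 1$ concrete. The function name `sfe_code` and the toy distribution are illustrative assumptions; in the proof, the symbols would be the strings $y$ of length $h^{-1}(|x|)$ and the probabilities would be $P(y|x)$.

```python
from fractions import Fraction


def sfe_code(probs):
    """Shannon-Fano-Elias code for a finite distribution.

    probs maps each symbol to a positive Fraction; symbols are taken in
    insertion order. The codeword of a symbol with probability p consists of
    the first ceil(-log2 p) + 1 bits of the binary expansion of the midpoint
    of its interval in the cumulative distribution.
    """
    code = {}
    cumulative = Fraction(0)
    for symbol, p in probs.items():
        midpoint = cumulative + p / 2
        # ceil(-log2 p) is the smallest l with 2**(-l) <= p.
        l = 0
        while Fraction(1, 2 ** l) > p:
            l += 1
        length = l + 1
        # Extract the first `length` binary digits of the midpoint.
        bits = []
        frac = midpoint
        for _ in range(length):
            frac *= 2
            bit = int(frac >= 1)
            bits.append(str(bit))
            frac -= bit
        code[symbol] = "".join(bits)
        cumulative += p
    return code


# Toy distribution over strings y of length 2 (assumed for illustration).
TOY = {
    "00": Fraction(1, 4),
    "01": Fraction(1, 2),
    "10": Fraction(1, 8),
    "11": Fraction(1, 8),
}
```

The resulting code is prefix-free, so a decoder that knows the distribution can recover $y$ from its codeword. A symbol whose probability stays bounded away from 0 receives codewords of bounded length, which is the property used to find the fixed program $p_0$.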

By definition, we have

$$-\log P(y|x) = -\log \int_{\Delta(y)} P(x|y^\infty)\,dP_Y(y^\infty) + \log \int_{Y} P(x|y^\infty)\,dP_Y(y^\infty). \tag{35}$$

Let $P_Y$ be a Lebesgue absolutely continuous measure. Let $\hat{y}$ be the maximum
likelihood estimator. By using the Laplace approximation with suitable
conditions, if $\hat{y} \in \Delta(y)$ and $|y| \approx \frac{1}{2} \log |x|$, then the right-hand side of (35) is
asymptotically bounded (for example see P), and we have $\inf_{x \sqsubset x^\infty} P(y|x) > 0$,
where $|y| = h^{-1}(|x|)$. Thus, by Theorem 6.2 a, we can compute the initial
$\lceil \frac{1}{2} \log |x| \rceil$ bits of $y^\infty$ from $x$ infinitely many times, which is an algorithmic
version of a well known result in statistics: $|y^\infty - \hat{y}| = O(1/\sqrt{n})$.

Let $h^{-1}$ be a large order function such that $\inf_{x \sqsubset x^\infty} P(y|x) = 0$ for
$|y| = h^{-1}(|x|)$; for example, set $h^{-1}(|x|) = \lceil \log |x| \rceil$. By Theorem 6.2 c, there
is no monotone computable function that computes the initial $h^{-1}(|x|)$ bits of
$y^\infty$ for all $x \sqsubset x^\infty$. If such a function exists, then $y^\infty$ is not random with
respect to $P_Y$, and the Lebesgue measure of such parameters is 0. On the
other hand, it is known that the set of parameters that are estimated within
$o(1/\sqrt{n})$ accuracy has Lebesgue measure 0 [I].

Theorem 6.2 shows a relation between the redundancy of universal coding
and parameter estimation; as in [18], if we set $P_Y$ to be a singular prior, we
have $\inf_{x \sqsubset x^\infty} P(y|x) > 0$ for a large order $h^{-1}$. In such a case we have a
super-efficient estimator.